This document outlines the different caching strategies employed within the codebase to improve performance and reduce redundant computations or data fetching.
- Location: Initialized in shared.py as g_cm = Cache().
- Mechanism: Uses an in-memory cache, the cacheout library.
- Uses:
  - Full Page Caching: (routes.py) Stores the entire rendered HTML output of the main page.
  - RSS Template Caching: (routes.py, workers.py) Caches templates fetched based on site URLs (rss_info.site_url). Cache entries are deleted by the worker when a feed is updated. That doesn't work across threads, so at page rendering time we check whether last_fetch differs from last_render and, if so, regenerate the template.
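The last_fetch versus last_render check described for RSS templates can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the code in routes.py: the plain dictionary stands in for g_cm, and CachedTemplate, get_template, and the render callback are hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class CachedTemplate:
    html: str
    last_render: float  # the feed's last_fetch value when this HTML was rendered


# Stand-in for the in-memory cache (g_cm); an assumption for this sketch.
_template_cache: Dict[str, CachedTemplate] = {}


def get_template(site_url: str, last_fetch: float,
                 render: Callable[[str], str]) -> str:
    """Return cached HTML unless the feed was fetched again since the last render."""
    entry = _template_cache.get(site_url)
    if entry is None or entry.last_render != last_fetch:
        # Missing or stale: regenerate and remember which fetch it reflects.
        entry = CachedTemplate(html=render(site_url), last_render=last_fetch)
        _template_cache[site_url] = entry
    return entry.html
```

Because staleness is decided by comparing timestamps at render time, the result stays correct even when an explicit cache delete issued elsewhere never reaches this cache.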
The application uses the diskcache library (accessed via the g_c object), which stores cached data persistently on disk using an SQLite backend (indicated by cache.db* files).
- Weather Data Caching:
  - Location: weather.py
  - Mechanism: Uses the generic disk cache (g_cs). This is a shared cache among all instances on the server. It stores weather data fetched from an API.
  - Keying: Keys are generated based on latitude and longitude, which are "bucketed" (grouped into nearby regions) to increase cache hits for close locations. Keys use the prefix weather:cache_entry:.
  - Expiration: Cached entries have a timeout (WEATHER_CACHE_TIMEOUT, currently 4 hours) and are also validated against the current date to ensure freshness.
  - Concurrency: Uses DiskcacheSqliteLock to prevent multiple processes/threads from fetching data for the same location simultaneously.
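One way the bucketed keying and the two-part freshness check might look is sketched below. The bucket size (BUCKETS_PER_DEGREE), the function names, and the reading of "validated against the current date" as "fetched on the same calendar day" are all assumptions made for illustration; only the key prefix and the 4-hour timeout come from the description above.

```python
import math
from datetime import datetime, timedelta

WEATHER_CACHE_TIMEOUT = 4 * 60 * 60  # seconds (4 hours, as stated above)
BUCKETS_PER_DEGREE = 10              # hypothetical: 0.1-degree buckets


def weather_cache_key(lat: float, lon: float) -> str:
    """Map nearby coordinates to the same key by flooring to a bucket index."""
    lat_bucket = math.floor(lat * BUCKETS_PER_DEGREE)
    lon_bucket = math.floor(lon * BUCKETS_PER_DEGREE)
    return f"weather:cache_entry:{lat_bucket}:{lon_bucket}"


def is_fresh(fetched_at: datetime, now: datetime) -> bool:
    """Fresh only if within the timeout AND fetched on the current date (assumed rule)."""
    within_timeout = now - fetched_at < timedelta(seconds=WEATHER_CACHE_TIMEOUT)
    same_day = fetched_at.date() == now.date()
    return within_timeout and same_day
```

With these assumed buckets, two requests a few kilometres apart share one cache entry, while an entry fetched late in the evening is rejected after midnight even though the timeout has not yet elapsed.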
- Chat/Comments Caching:
  - Location: routes.py
  - Mechanism: Uses the disk cache retrieved via the get_chat_cache() function, which returns either g_c (instance-specific cache) or g_cs (shared cache across instances) based on the USE_SHARED_CACHE_FOR_CHAT configuration value.
  - Purpose: Stores chat comments (associated with COMMENTS_KEY) and a set of banned IP addresses (BANNED_IPS_KEY) persistently on disk.
  - Persistence: Unlike in-memory caches, this data remains available across application restarts, enabling long-term storage of comments and banned IPs.
- Feed Fetching Synchronization:
  - Location: workers.py
  - Mechanism: Uses DiskcacheSqliteLock wrapping g_c's lock.
  - Purpose: Although this is not explicitly data caching with g_c, it uses locks provided by diskcache to ensure that only one worker process attempts to fetch a specific feed URL at any given time. This prevents redundant network requests.
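The general idea of a per-URL, cross-process lock stored in SQLite can be sketched with the standard library alone. This is not DiskcacheSqliteLock and does not use diskcache; SqliteKeyLock, its table layout, and the expiry (ttl_seconds) are hypothetical choices made only to illustrate the technique.

```python
import sqlite3
import time
import uuid


class SqliteKeyLock:
    """Hypothetical stand-in for a cross-process lock keyed by name (e.g. a feed URL)."""

    def __init__(self, db_path: str, name: str, ttl_seconds: float = 60.0):
        # isolation_level=None gives autocommit mode, so the transaction
        # boundaries are controlled explicitly with BEGIN/COMMIT below.
        self._conn = sqlite3.connect(db_path, isolation_level=None)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS locks ("
            "name TEXT PRIMARY KEY, owner TEXT NOT NULL, expires REAL NOT NULL)"
        )
        self._name = name
        self._owner = uuid.uuid4().hex  # identifies this holder on release
        self._ttl = ttl_seconds

    def acquire(self) -> bool:
        """Try once to take the lock; return False if another holder has it."""
        now = time.time()
        self._conn.execute("BEGIN IMMEDIATE")
        try:
            # Drop an expired lock so a crashed holder cannot block forever.
            self._conn.execute(
                "DELETE FROM locks WHERE name = ? AND expires < ?", (self._name, now)
            )
            # The primary key makes this insert fail if the lock is still held.
            self._conn.execute(
                "INSERT INTO locks (name, owner, expires) VALUES (?, ?, ?)",
                (self._name, self._owner, now + self._ttl),
            )
            self._conn.execute("COMMIT")
        except sqlite3.IntegrityError:
            self._conn.execute("ROLLBACK")
            return False
        return True

    def release(self) -> None:
        """Remove the lock row only if this instance still owns it."""
        self._conn.execute(
            "DELETE FROM locks WHERE name = ? AND owner = ?", (self._name, self._owner)
        )
```

A worker would call acquire() with the feed URL as the name, skip the fetch if it returns False, and call release() once the fetch finishes.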
- AI-Generated Headlines Caching:
  - Location: caching.py (via the get_cached_file_content() function used by get_cached_above_html() in routes.py)
  - Mechanism: Uses a simple in-memory dictionary (_file_cache) that stores file content along with modification timestamps.
  - Process:
    - Stores file content with its last modified time (mtime) and last check time.
    - Only re-reads the file from disk when:
      - The file isn't in the cache
      - The file's modification time has changed
      - The check interval has passed (_FILE_CHECK_INTERVAL_SECONDS, set to 5 minutes)
  - Purpose: Caches the content of the {mode}reportabove.html file (e.g., linuxreportabove.html, aireportabove.html), which contains AI-generated featured headlines that appear at the top of the page.
  - Management: Can be edited by administrators through the config page and is programmatically generated through the html_generation.py module, which renders headline templates with article titles and images.
  - Invalidation: Cache entries are explicitly invalidated when an admin saves new headlines by removing the entry from _file_cache directly, forcing a re-read on the next request.
  - Key difference from disk cache: This is a separate caching mechanism from diskcache and is primarily aimed at reducing disk I/O for frequently accessed but rarely changing files.
- Location: app.py
- Mechanism: Uses Flask-Assets for automatic asset bundling, minification, and cache busting.
- Process:
  - JavaScript Bundling: Combines modular JS files (core.js, weather.js, chat.js, config.js) from the templates directory into a single linuxreport.js file in the static directory during application startup.
  - CSS Management: Provides cache busting for linuxreport.css without modifying the original file.
  - Custom Header Filter: Adds compilation metadata (timestamp, hash, source files) to bundled JavaScript files.
  - Conditional Minification: JavaScript is only minified in production mode (when DEBUG=False and Flask debug mode is off), providing unminified code for easier debugging in development.
  - Automatic Cache Busting: Flask-Assets generates unique URLs with version parameters that automatically update when files change.
  - Template Integration: Assets are made available to templates via {% assets %} template tags, which automatically generate the correct URLs.
- Benefits:
  - Development Friendly: Unminified JavaScript in debug mode for easier debugging.
  - Production Optimized: Minified JavaScript and CSS for faster loading in production.
  - Automatic Updates: Changes to source files result in new cache-busting URLs, ensuring clients load the latest version.
  - Standard Tooling: Uses Flask-Assets, a well-maintained library for asset management.
  - No Manual Build Steps: Assets are built automatically on application startup.
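The principle behind version-parameter cache busting can be shown in a few lines. This is a generic sketch of the idea, not Flask-Assets' internals or configuration: versioned_url and the 8-character SHA-256 prefix are choices made purely for illustration.

```python
import hashlib


def versioned_url(url: str, content: bytes) -> str:
    """Append a short content hash so the URL changes whenever the file does."""
    version = hashlib.sha256(content).hexdigest()[:8]
    separator = "&" if "?" in url else "?"
    return f"{url}{separator}v={version}"
```

Browsers can then cache the asset aggressively, because any change to the bundled file yields a URL they have never seen before.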
The application employs multiple caching layers:
- cacheout (g_cm): For process-local in-memory caching with TTL and size limits (e.g., full pages, RSS templates).
- diskcache (g_c): For persistent disk-based caching shared across processes (e.g., weather data, chat comments, banned IPs) and for providing cross-process locking mechanisms (feed fetching, weather fetching).
- File-based caching (_file_cache): For caching content of frequently accessed files (AI-generated headlines) with modification time tracking to avoid unnecessary disk reads.
- Flask-Assets: For automatic asset bundling, minification, and cache busting of JavaScript and CSS files, with conditional minification based on debug mode.