KeithCu/LinuxReport

LinuxReport - Multi-Platform News Aggregation

A simple, fast, and intelligent news aggregation platform built with Python/Flask. It is designed as a modern drudgereport.com clone that automatically aggregates and curates news across multiple categories and updates 24/7 with AI-generated headlines.

This project is free and open source software released under the GNU Lesser General Public License v3.0 (LGPL v3).

DeepWiki provides excellent analysis of the codebase, including visual dependency graphs.

🌐 Live Sites

| Category | URL | Focus |
|----------|-----|-------|
| Linux | linuxreport.net | Linux news, open source, tech |
| COVID | covidreport.org | Health, pandemic updates |
| AI | aireport.keithcu.com | Artificial intelligence, ML |
| Solar/PV | pvreport.org | Solar energy, renewable tech |
| Techno | news.thedetroitilove.com | Detroit techno music |
| Space | news.spaceelevatorwiki.com | Space exploration |

✨ Key Features

  • 🚀 High performance with thread pools and efficient caching
  • 🤖 AI-powered headlines via OpenRouter.ai using a curated set of reliable models
  • 🎯 Multi-site support: multiple news categories from one shared codebase
  • 🌙 Dark mode, font controls, and mobile-friendly layout
  • ⚡ Multi-layer caching and optional CDN for fast responses
  • 🔒 Security best practices: rate limiting, admin auth, config-based secrets
  • 🛠️ Easy configuration of feeds and report types

🧠 AI-Powered Headlines

LinuxReport uses LLMs via OpenRouter.ai to generate and refine headlines.

  • Uses multiple high-quality models; failures fall back to a reliable default.
  • Logic is implemented in auto_update.py (model selection and retries).
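
The fallback behaviour described above can be sketched roughly as follows. This is an illustrative outline only, not the code in auto_update.py: the model identifiers, the `generate_headlines` function, and the `call_model` callback are all hypothetical names introduced for the example, and the real selection and retry rules may differ.

```python
import random
from typing import Callable, Sequence, Tuple

# Hypothetical model identifiers, used only for illustration.
CANDIDATE_MODELS = ["example/model-a", "example/model-b"]
FALLBACK_MODEL = "example/reliable-default"


def generate_headlines(
    prompt: str,
    call_model: Callable[[str, str], str],
    candidates: Sequence[str] = CANDIDATE_MODELS,
    fallback: str = FALLBACK_MODEL,
    retries_per_model: int = 2,
) -> Tuple[str, str]:
    """Return (model_used, response_text).

    Picks one candidate model at random, retries it a few times,
    then falls back to the default model before giving up.
    """
    first = random.choice(list(candidates))
    last_error = None
    for model in (first, fallback):
        for _attempt in range(retries_per_model):
            try:
                return model, call_model(model, prompt)
            except Exception as exc:  # a real client would catch narrower errors
                last_error = exc
    raise RuntimeError("all models failed") from last_error
```

Passing the API call in as `call_model` keeps the retry logic testable without network access; in a deployment it would wrap whatever client talks to OpenRouter.ai.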

🚀 Quick Start

# Clone the repository
git clone https://github.com/KeithCu/LinuxReport
cd LinuxReport

# Option 1: Modern approach with uv (recommended - 10-100x faster)
curl -LsSf https://astral.sh/uv/install.sh | sh
uv sync

# Option 2: Traditional approach with pip
pip install -r requirements.txt

# For CPU-only PyTorch and ML dependencies (optional, for auto-update features)
# Run this script to install CPU versions of PyTorch/sentence-transformers to save space:
./install_cpu_ml_deps.sh

# Configure (see Configuration section below)
cp config.yaml.example config.yaml
# Edit config.yaml with your settings

# Run development server
uv run python -m flask run
# Or with pip: python -m flask run

🏗️ Architecture Overview

High-level design:

  • Backend:
    • Python 3.x + Flask.
    • Background workers for scraping and updating feeds.
  • Storage and caching:
    • DiskCache (SQLite-backed) for persistent, high-performance caching.
    • In-memory cache for hot data.
    • File-based HTML snippets for AI-generated headline sections.
  • Frontend:
    • Jinja2 templates with modular JS/CSS.
    • Bundled/minified assets for production.
  • Scraping:
    • feedparser + BeautifulSoup4 for most sites.
    • Optional Selenium + Tor for complex/JS-heavy or privacy-sensitive sources.
  • Images:
    • Automatic optimization and WebP support.
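
To illustrate how background workers might fetch many feeds at once, here is a minimal thread-pool sketch using only the standard library. The `fetch_all` function and its `fetch_one` callback are hypothetical names for this example; the project's workers.py may be organized quite differently.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, Iterable


def fetch_all(
    urls: Iterable[str],
    fetch_one: Callable[[str], str],
    max_workers: int = 8,
) -> Dict[str, str]:
    """Fetch every URL concurrently; URLs whose fetch fails are skipped."""
    results: Dict[str, str] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each future back to its URL so results can be labeled.
        futures = {pool.submit(fetch_one, url): url for url in urls}
        for future in as_completed(futures):
            url = futures[future]
            try:
                results[url] = future.result()
            except Exception:
                # One broken feed should not stop the others.
                continue
    return results
```

In practice `fetch_one` could wrap a call such as `feedparser.parse(url)`; it is injected here so the concurrency logic can be exercised without network access.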

📋 Configuration

  1. Copy and edit config.yaml:
    • Set a strong admin password and secret_key.
    • Configure allowed_domains and any deployment-specific settings.
  2. Configure report types:
    • Edit *_report_settings.py to define feeds, titles, and behavior for each site.
  3. For production:
    • Use httpd-vhosts-sample.conf or equivalent web server configuration as a starting point.
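
A small startup check can catch the most common configuration mistakes from step 1. The sketch below works on an already-parsed config dictionary. It assumes `secret_key` and `allowed_domains` are top-level keys, which is a guess for illustration; check config.yaml.example for the actual layout. `find_config_problems` is a hypothetical helper, not part of the project.

```python
from typing import Any, Dict, List

# The placeholder password shipped in the sample configuration.
DEFAULT_PASSWORD = "CHANGE_THIS_DEFAULT_PASSWORD"


def find_config_problems(config: Dict[str, Any]) -> List[str]:
    """Return human-readable problems found in a parsed config dict."""
    problems: List[str] = []

    admin = config.get("admin") or {}
    password = admin.get("password", "")
    if not password or password == DEFAULT_PASSWORD:
        problems.append("admin.password is missing or still the default")

    # Assumed key locations; adjust to match the real config layout.
    if not config.get("secret_key"):
        problems.append("secret_key is not set")
    if not config.get("allowed_domains"):
        problems.append("allowed_domains is empty")

    return problems
```

The dictionary would typically come from `yaml.safe_load` (PyYAML) applied to config.yaml; an empty result list means none of these checks failed.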

🔧 Development

Project Structure (essential only)

  • app.py: Flask application setup and configuration.
  • routes.py: Main routing and request handling.
  • shared.py: Shared utilities, feature flags, and caches.
  • workers.py: Background feed processing.
  • auto_update.py: AI headline generation and scheduling.
  • *_report_settings.py: Report-specific configuration.
  • templates/: Jinja2 templates and modular JS/CSS (edit here).
  • static/: Bundled assets and images (do not hand-edit generated bundles).
  • tests/: pytest suite.
  • config.yaml: Runtime configuration.

Developer Notes

  • JS/CSS:
    • Edit source files in templates/; they are bundled into static/linuxreport.js and static/linuxreport.css.
  • Caching:
    • Multi-layer caching is central to performance.
    • See Caching.md or agents.md for deeper technical details.
  • Tests:
    • Use pytest to validate changes.

📖 Documentation

  • agents.md: Technical guide for AI agents and contributors.
  • Caching.md: Detailed caching and performance internals.
  • ROADMAP.md: Planned features and improvements.
  • Scaling.md: Scaling and performance notes.

🔒 Security

Admin Mode Protection

Admin functionality is protected by authentication:

# config.yaml
admin:
  password: "CHANGE_THIS_DEFAULT_PASSWORD"

⚠️ IMPORTANT: Change the default password immediately after installation!

Security Features

  • Rate Limiting: Configurable per-endpoint throttling
  • Input Validation: Secure file uploads and form processing
  • CORS Protection: Configurable domain allowlists
  • Security Headers: XSS protection, content type validation
  • IP Blocking: Persistent banned IP storage
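
As a rough illustration of per-endpoint throttling, the sketch below implements a sliding-window limiter keyed by endpoint and client IP. It is a generic example under stated assumptions, not the project's rate-limiting code: the `SlidingWindowLimiter` class, the `/login` endpoint, and the limits shown are all hypothetical.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict, Tuple


class SlidingWindowLimiter:
    """Allow at most N requests per window, per (endpoint, client IP)."""

    def __init__(
        self,
        limits: Dict[str, Tuple[int, float]],
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        # limits maps endpoint -> (max_requests, window_seconds)
        self.limits = limits
        self.clock = clock
        self.hits: Dict[Tuple[str, str], Deque[float]] = defaultdict(deque)

    def allow(self, endpoint: str, client_ip: str) -> bool:
        if endpoint not in self.limits:
            return True  # endpoints without a configured limit are not throttled
        max_requests, window = self.limits[endpoint]
        now = self.clock()
        hits = self.hits[(endpoint, client_ip)]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= window:
            hits.popleft()
        if len(hits) >= max_requests:
            return False
        hits.append(now)
        return True
```

For example, `SlidingWindowLimiter({"/login": (5, 60.0)})` would permit five login attempts per minute from each address. This version keeps state in process memory, so a multi-process deployment would need a shared store for limits to apply across workers.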

🚀 Production Deployment (quick overview)

  • Use a WSGI-capable web server (e.g., Apache with mod_wsgi, or Gunicorn or uWSGI behind nginx).
  • Use httpd-vhosts-sample.conf as a reference if deploying with Apache.
  • Run background tasks (e.g., headline updates) via systemd timers or cron:
    • Example units/scripts are provided; adjust paths and commands for your environment.

🤝 Contributing

We welcome contributions! Please:

  1. Fork the repository
  2. Create a feature branch
  3. Run tests: pytest tests/
  4. Submit a pull request

Feel free to request new RSS feeds or suggest improvements.

📈 Performance (summary)

LinuxReport is designed to be fast in real-world deployments:

  • Multi-layer caching minimizes database reads and external calls.
  • Concurrent processing handles many feeds efficiently.
  • Works well with multi-process setups; each process uses its own in-memory cache on top of shared persistent cache.
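
The layering described in the last point can be sketched as a per-process dictionary in front of a shared dict-like store. This is a simplified illustration, not the project's implementation (see Caching.md for that): `TwoTierCache` is a hypothetical class, and the shared store is any object with dict-style `get` and item assignment, which `diskcache.Cache` also offers.

```python
import time
from typing import Any, Callable, Dict, MutableMapping, Optional, Tuple


class TwoTierCache:
    """Per-process in-memory cache layered over a shared persistent store."""

    def __init__(
        self,
        shared: MutableMapping[str, Any],
        ttl: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.shared = shared
        self.ttl = ttl
        self.clock = clock
        # key -> (expiry_time, value), private to this process
        self.local: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Optional[Any]:
        entry = self.local.get(key)
        if entry is not None and self.clock() < entry[0]:
            return entry[1]  # hot path: no trip to the shared store
        value = self.shared.get(key)
        if value is not None:
            self.local[key] = (self.clock() + self.ttl, value)
        else:
            self.local.pop(key, None)
        return value

    def set(self, key: str, value: Any) -> None:
        self.shared[key] = value
        self.local[key] = (self.clock() + self.ttl, value)
```

The trade-off of this design is that another process may serve a value that is up to `ttl` seconds stale after a write, which is usually acceptable for news pages that change every few minutes.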

📄 License

This project is free and open source software released under the GNU Lesser General Public License v3.0 (LGPL v3). See the LICENSE file for complete details.

CDN and Static Asset Delivery

  • Optional CDN/object storage integration via s3cmd.
  • Long cache headers for static assets.
  • Configuration driven from config.yaml.
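
As an illustration of long cache headers, the helper below returns a `Cache-Control` header for paths under an assumed `/static/` prefix. The function name, the prefix, and the one-year lifetime are assumptions for this sketch; the project's actual values are driven by config.yaml.

```python
from typing import Dict

ONE_YEAR_SECONDS = 365 * 24 * 60 * 60


def static_cache_headers(path: str, max_age: int = ONE_YEAR_SECONDS) -> Dict[str, str]:
    """Return long-lived caching headers for static assets, none otherwise."""
    if path.startswith("/static/"):
        return {"Cache-Control": f"public, max-age={max_age}"}
    # Dynamic pages are left to the application's own caching policy.
    return {}
```

Long lifetimes like this are safe only when asset URLs change whenever the file contents change, for example through a version or hash in the file name or query string.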

Built with ❤️ for the free and open source community
