A comprehensive Qt/C++ desktop application for print media clipping, OCR processing, and digital content management. The system provides professional tools for scanning, processing, and organizing print media articles with advanced OCR capabilities and modern API integration.
The Socialhose Media System is a full-featured print news clipping platform designed for media monitoring organizations, news agencies, and content management teams. It combines powerful desktop applications with modern REST API integration for seamless workflow management.
- ClippingStation: Main GUI application for scanning, clipping, and managing print media articles
- emsOCR: High-performance OCR engine with crash recovery and auto-restart capabilities
- Socialhose API Integration: Modern REST API backend for campaign management and data synchronization
- Windows (Primary for OCR Server)
- Linux (Ubuntu/Debian - Primary Clipping Station)
- macOS (Limited support)
┌─────────────────┐    ┌─────────────────┐
│ ClippingStation │    │     emsOCR      │
│   (Main GUI)    │◄──►│  (OCR Engine)   │
│ Qt/C++ Desktop  │    │  Process-based  │
└─────────────────┘    └─────────────────┘
         │                      │
         ▼                      ▼
┌─────────────────────────────────────────────────────────────────┐
│                           Data Layer                            │
├─────────────────┬─────────────────┬─────────────────────────────┤
│    MySQL DB     │ Socialhose API  │         File System         │
│    (Legacy)     │  (New Backend)  │       (Images/Cache)        │
└─────────────────┴─────────────────┴─────────────────────────────┘
- Frontend: Qt 6 (C++) with Qt Widgets
- Backend: MySQL Database (Legacy) + Socialhose REST API (New)
- OCR: IRIS iDRS OCR engine with advanced Arabic and English text recognition
- Build System: qmake/Visual Studio
- Platforms: Cross-platform desktop application
The system uses MySQL with a comprehensive schema supporting:
- Articles: Main article storage with OCR text, metadata, and clipping information
- Publications: Publication management and issue tracking
- Users: Authentication, permissions, and user profiles
- Coordinates: Precise clipping regions and OCR word positioning
- Tags: Article categorization and keyword management
- Campaigns: Socialhose API integration for modern workflow management
- Print Media Scanning: High-resolution scanning with automatic page detection
- Smart Clipping: Intuitive drag-and-drop interface for article extraction
- Advanced OCR: Multi-engine OCR with Arabic and English text recognition
- Intelligent Tagging: Automatic keyword extraction and manual categorization
- Analytics: Article metrics, circulation data, and impression tracking
- Multi-user Support: Role-based access control with user authentication
- Intelligent Caching: Memory-efficient image caching with thumbnail generation
- Network Image Support: Remote image loading with local caching
- Real-time Processing: Background OCR processing with progress monitoring
- API Integration: Modern REST API for cloud synchronization
- Crash Recovery: Automatic OCR engine restart with Windows crash dump generation
- Precision Tools: Zoom, rotation, and fine-tuning controls for accurate clipping
- Campaign Management: Create and manage monitoring campaigns
- Mention Tracking: Track articles as social media mentions
- Keyword Monitoring: Real-time keyword and brand monitoring
- Organization Support: Multi-tenant organization-based access
- JWT Authentication: Secure token-based authentication
- Hybrid Data Mode: Seamless switching between API and local database
- Qt 6.x (Successfully migrated from Qt 4)
- MySQL Server 5.7+ or MariaDB
- C++ Compiler (MSVC, GCC, or Clang)
- CMake 3.16+ or qmake
# Clone the repository
git clone <repository-url>
cd mediasystem
# Build ClippingStation (Main Application)
cd ClippingStation
qmake ClippingStation.pro
make
# Build emsOCR (OCR Engine)
cd ../emsOCR
qmake emsOCR.pro
make
# Build OurOCREngine (Alternative OCR)
cd ../OurOCREngine
qmake OurOCREngine.pro
make
# Open solution files in Visual Studio
ClippingStation/ClippingStation.sln
emsOCR/emsOCR.sln
OurOCREngine/OurOCREngine.sln
# Build using Visual Studio IDE
# Open project files in Qt Creator
ClippingStation/ClippingStation.pro
emsOCR/emsOCR.pro
OurOCREngine/OurOCREngine.pro
# Build and run from Qt Creator IDE
# Clean previous builds
make clean
# Rebuild from scratch
qmake
make
-- Create database
CREATE DATABASE socialhose CHARACTER SET utf8 COLLATE utf8_unicode_ci;
-- Import schema (run this from a shell, not from the MySQL prompt)
mysql -u root -p socialhose < db_schema.sql
-- Configure connection in config.ini
[master database]
server=localhost
port=3306
database=socialhose
uid=root
pwd=your_password
[master database]
server=localhost
port=3306
database=socialhose
uid=root
pwd=your_password
[api]
base_url=http://localhost:8001
timeout=30000
data_source=hybrid
enabled=true
remember_login=false
[shortcuts]
Clip=Ctrl+S
Preclip=Ctrl+D
AddPage=Ctrl+A
NextPage=Ctrl+E
PreviousPage=Ctrl+W
[debug mode]
debug=on
- MVC Pattern: Models, Views, and Controllers clearly separated
- Repository Pattern: Data access abstraction with API/Database dual mode
- Signal-Slot Mechanism: Qt's event-driven communication pattern
- Process Isolation: OCR engines run as separate processes for stability
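The repository pattern with API/database dual mode can be illustrated with a small standalone sketch. This is not the project's actual code: `Article`, `ArticleRepository`, `HybridArticleRepository`, and the two stub classes are hypothetical names, and the sketch assumes that an unreachable source is signalled by `std::runtime_error`. It only shows the general idea of trying one data source first and falling back to another.

```cpp
#include <map>
#include <memory>
#include <optional>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical record type; the real schema has many more fields.
struct Article {
    int id;
    std::string title;
};

// Common interface so callers do not care where the data comes from.
class ArticleRepository {
public:
    virtual ~ArticleRepository() = default;
    // Returns std::nullopt when the article does not exist and throws
    // std::runtime_error when the data source cannot be reached.
    virtual std::optional<Article> findById(int id) = 0;
};

// Stand-in for the local database: keeps articles in memory.
class InMemoryArticleRepository : public ArticleRepository {
public:
    void add(Article article) {
        const int id = article.id;
        articles_[id] = std::move(article);
    }
    std::optional<Article> findById(int id) override {
        auto it = articles_.find(id);
        if (it == articles_.end()) return std::nullopt;
        return it->second;
    }
private:
    std::map<int, Article> articles_;
};

// Stand-in for a REST backend that is currently offline.
class UnreachableArticleRepository : public ArticleRepository {
public:
    std::optional<Article> findById(int) override {
        throw std::runtime_error("API unreachable");
    }
};

// Tries the primary source first and uses the fallback source only when
// the primary reports a connection problem.
class HybridArticleRepository : public ArticleRepository {
public:
    HybridArticleRepository(std::unique_ptr<ArticleRepository> primary,
                            std::unique_ptr<ArticleRepository> fallback)
        : primary_(std::move(primary)), fallback_(std::move(fallback)) {}

    std::optional<Article> findById(int id) override {
        try {
            return primary_->findById(id);
        } catch (const std::runtime_error&) {
            return fallback_->findById(id);
        }
    }
private:
    std::unique_ptr<ArticleRepository> primary_;
    std::unique_ptr<ArticleRepository> fallback_;
};
```

Constructing a `HybridArticleRepository` with an `UnreachableArticleRepository` as primary and a populated `InMemoryArticleRepository` as fallback returns articles from the in-memory store, which mirrors the offline behaviour described for hybrid mode. A "not found" result from the primary is deliberately not retried against the fallback; whether the real client does so is not stated here.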
- ClippingStation: Main application window and coordinator
- FullPageView: High-resolution image viewer with clipping tools
- DrawerView: Article composition and layout management
- OcrThread: Background OCR processing with progress tracking
- ImageCache: Memory-efficient image caching system
- SocialhoseApiClient: Modern REST API client with JWT authentication
- Image Caching: Multi-level caching (memory, disk, network)
- Lazy Loading: On-demand image and data loading
- Background Processing: Non-blocking OCR and network operations
- Memory Management: Qt's parent-child object hierarchy for automatic cleanup
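As an illustration of the in-memory level of such a cache, here is a minimal least-recently-used (LRU) sketch with a byte budget, written in standard C++ without Qt. The class name `LruImageCache` and the `ImageBytes` alias are assumptions made for this example; the project's `ImageCache` may use a different eviction policy or Qt's own containers.

```cpp
#include <cstddef>
#include <list>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

using ImageBytes = std::vector<unsigned char>;

// Keeps the most recently used images in memory, up to maxBytes in total.
class LruImageCache {
public:
    explicit LruImageCache(std::size_t maxBytes) : maxBytes_(maxBytes) {}

    // Inserts or replaces an image, then evicts old entries if over budget.
    void put(const std::string& key, ImageBytes data) {
        erase(key);
        if (data.size() > maxBytes_) return;  // larger than the whole budget
        usedBytes_ += data.size();
        entries_.emplace_front(key, std::move(data));
        index_[key] = entries_.begin();
        while (usedBytes_ > maxBytes_) {
            // The back of the list is the least recently used entry.
            const Entry& oldest = entries_.back();
            usedBytes_ -= oldest.second.size();
            index_.erase(oldest.first);
            entries_.pop_back();
        }
    }

    // Returns the cached image, or nullptr on a miss. A hit marks the
    // entry as most recently used.
    const ImageBytes* get(const std::string& key) {
        auto it = index_.find(key);
        if (it == index_.end()) return nullptr;
        entries_.splice(entries_.begin(), entries_, it->second);
        return &it->second->second;
    }

    std::size_t usedBytes() const { return usedBytes_; }

private:
    using Entry = std::pair<std::string, ImageBytes>;

    void erase(const std::string& key) {
        auto it = index_.find(key);
        if (it == index_.end()) return;
        usedBytes_ -= it->second->second.size();
        entries_.erase(it->second);
        index_.erase(it);
    }

    std::size_t maxBytes_;
    std::size_t usedBytes_ = 0;
    std::list<Entry> entries_;  // front = most recently used
    std::unordered_map<std::string, std::list<Entry>::iterator> index_;
};
```

A disk or network level would sit behind `get`: on a miss the caller loads the image from the slower source and stores it with `put`. Pairing a linked list with a hash map keeps both lookup and eviction at constant time.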
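# Test API connectivity
# Note: the second cp overwrites main.cpp, so keep a backup to restore afterwards
cd ClippingStation
cp main.cpp main.cpp.bak
cp test_api_client.cpp main.cpp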
qmake && make
./ClippingStation
# Verify database connectivity
mysql -u root -p socialhose -e "SELECT COUNT(*) FROM articles;"
mediasystem/
├── ClippingStation/ # Main GUI application
│ ├── api/ # Socialhose API integration (TBD)
│ │ ├── dto/ # Data Transfer Objects
│ │ └── repositories/ # Repository pattern implementation
│ ├── clippingstation.* # Main application files
│ ├── fullpageview.* # Image viewer component
│ ├── drawerview.* # Article composition tool
│ └── config.ini # Application configuration
├── emsOCR/ # Primary OCR engine
│ ├── main.cpp # Process entry point
│ └── emsocrdialog.* # OCR processing dialog
├── db_schema.sql # Complete database schema
└── README.md # This file
The system now supports modern REST API integration with the Socialhose service:
- Authentication: JWT-based with automatic token refresh
- Endpoints: Full CRUD operations for campaigns, mentions, and keywords
- Data Sources: Migrate away from direct SQL access to API
- Offline Support: Graceful fallback to local database when offline
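Automatic token refresh usually means replacing the access token shortly before it expires rather than waiting for a request to fail. The sketch below shows that idea in standard C++. `Token`, `TokenManager`, and the refresh callback are hypothetical names for illustration; how `SocialhoseApiClient` actually obtains and renews its JWT is not documented here, and the sketch does not decode the token itself.

```cpp
#include <chrono>
#include <functional>
#include <string>
#include <utility>

// Hypothetical token holder: the raw token string plus its expiry time.
struct Token {
    std::string value;
    std::chrono::steady_clock::time_point expiresAt;
};

// Hands out a valid access token, calling the refresh function whenever
// the current token is missing or about to expire.
class TokenManager {
public:
    using Clock = std::chrono::steady_clock;
    using RefreshFn = std::function<Token()>;

    // margin: how long before expiry a token is treated as stale.
    TokenManager(RefreshFn refresh, std::chrono::seconds margin)
        : refresh_(std::move(refresh)), margin_(margin) {}

    const std::string& accessToken() {
        if (!hasToken_ || Clock::now() + margin_ >= token_.expiresAt) {
            token_ = refresh_();  // e.g. a login or refresh request
            hasToken_ = true;
        }
        return token_.value;
    }

private:
    RefreshFn refresh_;
    std::chrono::seconds margin_;
    Token token_{};
    bool hasToken_ = false;
};
```

In a real client the refresh function would perform the HTTP call and derive the expiry from the server response. Each outgoing request then asks `accessToken()` for the value to place in its `Authorization: Bearer` header.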
- Qt 6 Migration: The project has been migrated from Qt 4 to Qt 6; build with Qt 6.x only
- Database Connection: Check MySQL service and config.ini settings
- OCR Engine Crashes: Automatic restart enabled, check MiniDump.dmp for details
- API Connection: Verify base_url in config.ini and network connectivity
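When checking the database or API settings, it can help to print what config.ini actually contains. The following is a minimal standalone INI reader in standard C++, offered as a diagnostic sketch only: `parseIni`, `trim`, and `IniData` are names invented for this example, and the application itself may read the file differently (for instance through Qt's settings classes).

```cpp
#include <istream>
#include <map>
#include <sstream>
#include <string>

// section name -> (key -> value)
using IniData = std::map<std::string, std::map<std::string, std::string>>;

// Removes leading and trailing whitespace.
std::string trim(const std::string& s) {
    const char* whitespace = " \t\r\n";
    const auto begin = s.find_first_not_of(whitespace);
    if (begin == std::string::npos) return "";
    const auto end = s.find_last_not_of(whitespace);
    return s.substr(begin, end - begin + 1);
}

// Parses "[section]" headers and "key=value" pairs. Blank lines and
// lines starting with ';' or '#' are skipped.
IniData parseIni(std::istream& in) {
    IniData data;
    std::string line;
    std::string section;
    while (std::getline(in, line)) {
        line = trim(line);
        if (line.empty() || line[0] == ';' || line[0] == '#') continue;
        if (line.front() == '[' && line.back() == ']') {
            section = trim(line.substr(1, line.size() - 2));
            continue;
        }
        const auto eq = line.find('=');
        if (eq == std::string::npos) continue;  // not a key=value line
        data[section][trim(line.substr(0, eq))] = trim(line.substr(eq + 1));
    }
    return data;
}
```

Passing an `std::ifstream` opened on `ClippingStation/config.ini` to `parseIni` and then printing `data["api"]["base_url"]` or `data["master database"]["server"]` shows the values the troubleshooting items above refer to. Section names containing spaces, such as `[master database]`, are kept verbatim.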
Enable debug logging in config.ini:
[debug mode]
debug=on
- Examine configuration:
ClippingStation/config.ini
When all else fails, email [email protected]
This project is licensed under the GNU General Public License v3.0 (GPLv3).
See the LICENSE file for full license details.