Socialhose Media System

A Qt/C++ desktop application for print media clipping, OCR processing, and digital content management. It provides tools for scanning, processing, and organizing print media articles, with Arabic and English OCR and REST API integration.

Project Overview

The Socialhose Media System is a full-featured print news clipping platform designed for media monitoring organizations, news agencies, and content management teams. It combines powerful desktop applications with modern REST API integration for seamless workflow management.

Key Components

  • ClippingStation: Main GUI application for scanning, clipping, and managing print media articles
  • emsOCR: High-performance OCR engine with crash recovery and auto-restart capabilities
  • Socialhose API Integration: Modern REST API backend for campaign management and data synchronization

Target Platforms

  • Windows (Primary for OCR Server)
  • Linux (Ubuntu/Debian - Primary Clipping Station)
  • macOS (Limited support)

Architecture

System Architecture

┌─────────────────┐    ┌─────────────────┐   
│  ClippingStation│    │    emsOCR       │    
│  (Main GUI)     │◄──►│  (OCR Engine)   │   
│  Qt/C++ Desktop │    │  Process-based  │   
└─────────────────┘    └─────────────────┘   
         │                       │           
         ▼                       ▼           
┌─────────────────────────────────────────────────────────────────┐
│                    Data Layer                                   │
├─────────────────┬─────────────────┬─────────────────────────────┤
│   MySQL DB      │  Socialhose API │    File System              │
│   (Legacy)      │  (New Backend)  │    (Images/Cache)           │
└─────────────────┴─────────────────┴─────────────────────────────┘

Technology Stack

  • Frontend: Qt 6 (C++) with Qt Widgets
  • Backend: MySQL Database (Legacy) + Socialhose REST API (New)
  • OCR: IRIS iDRS OCR engine with advanced Arabic and English text recognition
  • Build System: qmake/Visual Studio
  • Platforms: Cross-platform desktop application

Database Architecture

The system uses MySQL with a comprehensive schema supporting:

  • Articles: Main article storage with OCR text, metadata, and clipping information
  • Publications: Publication management and issue tracking
  • Users: Authentication, permissions, and user profiles
  • Coordinates: Precise clipping regions and OCR word positioning
  • Tags: Article categorization and keyword management
  • Campaigns: Socialhose API integration for modern workflow management
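To make the Coordinates item more concrete, here is a minimal sketch of how clipping regions and OCR word positions could relate: given word bounding boxes for a page, collect the text that falls inside one clipped rectangle. The types `Rect` and `OcrWord` and the function `textInRegion` are hypothetical and are not taken from the project's schema or source; the real tables and classes may be organized differently.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical axis-aligned rectangle in page pixel coordinates.
struct Rect {
    int x, y, width, height;

    // True when `other` lies entirely inside this rectangle.
    bool contains(const Rect& other) const {
        return other.x >= x && other.y >= y &&
               other.x + other.width <= x + width &&
               other.y + other.height <= y + height;
    }
};

// Hypothetical OCR word with its bounding box on the page.
struct OcrWord {
    std::string text;
    Rect box;
};

// Join the text of every word whose box lies fully inside the clip region,
// preserving the order in which the words were supplied.
inline std::string textInRegion(const std::vector<OcrWord>& words, const Rect& clip) {
    std::string out;
    for (const auto& word : words) {
        if (!clip.contains(word.box)) continue;
        if (!out.empty()) out += ' ';
        out += word.text;
    }
    return out;
}
```

Storing per-word boxes alongside the clip rectangle means the article text can be recomputed if a clipping region is adjusted, without running OCR again.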

Features

Core Functionality

  • Print Media Scanning: High-resolution scanning with automatic page detection
  • Smart Clipping: Intuitive drag-and-drop interface for article extraction
  • Advanced OCR: Multi-engine OCR with Arabic and English text recognition
  • Intelligent Tagging: Automatic keyword extraction and manual categorization
  • Analytics: Article metrics, circulation data, and impression tracking
  • Multi-user Support: Role-based access control with user authentication

Advanced Features

  • Intelligent Caching: Memory-efficient image caching with thumbnail generation
  • Network Image Support: Remote image loading with local caching
  • Real-time Processing: Background OCR processing with progress monitoring
  • API Integration: Modern REST API for cloud synchronization
  • Crash Recovery: Automatic OCR engine restart with Windows crash dump generation
  • Precision Tools: Zoom, rotation, and fine-tuning controls for accurate clipping
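The crash recovery behavior listed above can be pictured as a small supervision loop: start the OCR worker, and if it exits abnormally, start it again up to some limit. The sketch below assumes a callable that blocks until the worker ends and returns its exit code (in a Qt application this could wrap QProcess). The names `superviseWithRestart` and `SupervisionResult` are hypothetical, and the actual restart policy in emsOCR is not documented here.

```cpp
#include <cassert>
#include <functional>

// Hypothetical summary of one supervision run.
struct SupervisionResult {
    int launches;    // how many times the worker was started
    bool succeeded;  // true if some run exited with code 0
};

// Run `launchWorker` until it exits cleanly or `maxRestarts` restarts are used.
// `launchWorker` is assumed to block until the worker ends and return its exit code.
inline SupervisionResult superviseWithRestart(const std::function<int()>& launchWorker,
                                              int maxRestarts) {
    assert(maxRestarts >= 0);
    SupervisionResult result{0, false};
    for (int attempt = 0; attempt <= maxRestarts; ++attempt) {
        ++result.launches;
        if (launchWorker() == 0) {
            result.succeeded = true;
            break;
        }
        // Non-zero exit code: treat it as a crash and loop to restart the worker.
    }
    return result;
}
```

Bounding the number of restarts keeps a worker that crashes on every launch from looping forever; a real supervisor would probably also log each failure and wait briefly between attempts.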

Socialhose API Integration (New)

  • Campaign Management: Create and manage monitoring campaigns
  • Mention Tracking: Track articles as social media mentions
  • Keyword Monitoring: Real-time keyword and brand monitoring
  • Organization Support: Multi-tenant organization-based access
  • JWT Authentication: Secure token-based authentication
  • Hybrid Data Mode: Seamless switching between API and local database

Build Instructions

Prerequisites

  • Qt 6.x (Successfully migrated from Qt 4)
  • MySQL Server 5.7+ or MariaDB
  • C++ Compiler (MSVC, GCC, or Clang)
  • CMake 3.16+ or qmake

Quick Build (Qt 6)

# Clone the repository
git clone <repository-url>
cd mediasystem

# Build ClippingStation (Main Application)
cd ClippingStation
qmake ClippingStation.pro
make

# Build emsOCR (OCR Engine)
cd ../emsOCR
qmake emsOCR.pro
make

# Build OurOCREngine (Alternative OCR)
cd ../OurOCREngine
qmake OurOCREngine.pro
make

Windows Build (Visual Studio)

# Open solution files in Visual Studio
ClippingStation/ClippingStation.sln
emsOCR/emsOCR.sln
OurOCREngine/OurOCREngine.sln

# Build using Visual Studio IDE

Qt Creator Build

# Open project files in Qt Creator
ClippingStation/ClippingStation.pro
emsOCR/emsOCR.pro
OurOCREngine/OurOCREngine.pro

# Build and run from Qt Creator IDE

Clean Build

# Clean previous builds
make clean

# Rebuild from scratch
qmake
make

Database Setup

MySQL Database Configuration

-- Create the database (run inside the mysql client)
CREATE DATABASE socialhose CHARACTER SET utf8 COLLATE utf8_unicode_ci;

# Import the schema (run from a shell, not inside the mysql client)
mysql -u root -p socialhose < db_schema.sql

; Configure the connection in config.ini
[master database]
server=localhost
port=3306
database=socialhose
uid=root
pwd=your_password

Configuration File (config.ini)

[master database]
server=localhost
port=3306
database=socialhose
uid=root
pwd=your_password

[api]
base_url=http://localhost:8001
timeout=30000
data_source=hybrid
enabled=true
remember_login=false

[shortcuts]
Clip=Ctrl+S
Preclip=Ctrl+D
AddPage=Ctrl+A
NextPage=Ctrl+E
PreviousPage=Ctrl+W

[debug mode]
debug=on

Development Guidelines

Code Architecture

  • MVC Pattern: Models, Views, and Controllers clearly separated
  • Repository Pattern: Data access abstraction with API/Database dual mode
  • Signal-Slot Mechanism: Qt's event-driven communication pattern
  • Process Isolation: OCR engines run as separate processes for stability
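The repository pattern with API/database dual mode can be sketched as one interface with two backends and a hybrid wrapper that prefers the API and falls back to the local database. Everything below is illustrative: `Article`, `ArticleRepository`, and the three implementations are hypothetical stand-ins, and the stub backends return canned data instead of calling the Socialhose API or MySQL.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// All names below are hypothetical and exist only for illustration.
struct Article {
    int id;
    std::string headline;
};

// Common interface so callers do not care where the data comes from.
class ArticleRepository {
public:
    virtual ~ArticleRepository() = default;
    virtual std::vector<Article> findByKeyword(const std::string& keyword) = 0;
};

// Stand-in for a REST-backed repository; throws when the API is unreachable.
class ApiArticleRepository : public ArticleRepository {
public:
    explicit ApiArticleRepository(bool online) : online_(online) {}
    std::vector<Article> findByKeyword(const std::string& keyword) override {
        if (!online_) throw std::runtime_error("API unreachable");
        return {Article{1, "api: " + keyword}};
    }
private:
    bool online_;
};

// Stand-in for a repository backed by the local database.
class DbArticleRepository : public ArticleRepository {
public:
    std::vector<Article> findByKeyword(const std::string& keyword) override {
        return {Article{1, "db: " + keyword}};
    }
};

// Hybrid mode: prefer the API, fall back to the local database on failure.
class HybridArticleRepository : public ArticleRepository {
public:
    HybridArticleRepository(ArticleRepository& api, ArticleRepository& db)
        : api_(api), db_(db) {}
    std::vector<Article> findByKeyword(const std::string& keyword) override {
        try {
            return api_.findByKeyword(keyword);
        } catch (const std::runtime_error&) {
            return db_.findByKeyword(keyword);
        }
    }
private:
    ArticleRepository& api_;
    ArticleRepository& db_;
};
```

Because the views depend only on the interface, a setting such as `data_source=hybrid` could select which implementation is constructed at startup without touching the calling code.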

Key Classes and Components

  • ClippingStation: Main application window and coordinator
  • FullPageView: High-resolution image viewer with clipping tools
  • DrawerView: Article composition and layout management
  • OcrThread: Background OCR processing with progress tracking
  • ImageCache: Memory-efficient image caching system
  • SocialhoseApiClient: Modern REST API client with JWT authentication

Performance Optimization

  • Image Caching: Multi-level caching (memory, disk, network)
  • Lazy Loading: On-demand image and data loading
  • Background Processing: Non-blocking OCR and network operations
  • Memory Management: Qt's parent-child object hierarchy for automatic cleanup
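As an illustration of the in-memory tier of such a cache, here is a minimal least-recently-used (LRU) sketch that evicts the entry that has gone unused the longest once capacity is reached. The class `LruCache` is hypothetical and is not the project's ImageCache; the eviction policy used there is an assumption, and the value type is a template parameter so the sketch does not depend on any image class.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical LRU cache keyed by image path.
template <typename Value>
class LruCache {
public:
    explicit LruCache(std::size_t capacity) : capacity_(capacity) {
        assert(capacity > 0 && "cache must hold at least one entry");
    }

    // Insert or overwrite an entry and mark it most recently used.
    void put(const std::string& key, Value value) {
        auto it = index_.find(key);
        if (it != index_.end()) {
            it->second->second = std::move(value);
            order_.splice(order_.begin(), order_, it->second);
            return;
        }
        if (order_.size() == capacity_) {
            // Evict the least recently used entry, which sits at the back.
            index_.erase(order_.back().first);
            order_.pop_back();
        }
        order_.emplace_front(key, std::move(value));
        index_[key] = order_.begin();
    }

    // Return a pointer to the cached value, or nullptr on a miss.
    // A hit also marks the entry as most recently used.
    const Value* get(const std::string& key) {
        auto it = index_.find(key);
        if (it == index_.end()) return nullptr;
        order_.splice(order_.begin(), order_, it->second);
        return &it->second->second;
    }

    std::size_t size() const { return order_.size(); }

private:
    using Entry = std::pair<std::string, Value>;
    std::size_t capacity_;
    std::list<Entry> order_;  // front = most recently used
    std::unordered_map<std::string, typename std::list<Entry>::iterator> index_;
};
```

A miss in this tier would then fall through to the disk cache and finally to the network, with the loaded image stored back via `put`.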

Integration Testing

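# Test API connectivity
# Warning: this replaces main.cpp with the API test harness
cd ClippingStation
cp test_api_client.cpp main.cpp
qmake && make
./ClippingStation

# Restore the original main.cpp afterwards (assumes it is tracked in git)
git checkout -- main.cpp

# Verify database connectivity
mysql -u root -p socialhose -e "SELECT COUNT(*) FROM articles;"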

Project Structure

mediasystem/
├── ClippingStation/          # Main GUI application
│   ├── api/                  # Socialhose API integration (TBD)
│   │   ├── dto/              # Data Transfer Objects
│   │   └── repositories/     # Repository pattern implementation
│   ├── clippingstation.*     # Main application files
│   ├── fullpageview.*        # Image viewer component
│   ├── drawerview.*          # Article composition tool
│   └── config.ini            # Application configuration
├── emsOCR/                   # Primary OCR engine
│   ├── main.cpp              # Process entry point
│   └── emsocrdialog.*        # OCR processing dialog
├── db_schema.sql             # Complete database schema
└── README.md                 # This file

API Documentation

Socialhose API Integration (PLANNED)

The planned REST API integration with the Socialhose service covers:

  • Authentication: JWT-based with automatic token refresh
  • Endpoints: Full CRUD operations for campaigns, mentions, and keywords
  • Data Sources: Migration away from direct SQL access to the API
  • Offline Support: Graceful fallback to the local database when offline
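One way to think about automatic token refresh is a check that runs before each request and triggers a refresh slightly ahead of expiry. The sketch below assumes the client keeps the access token together with its expiry time; `AuthSession`, `needsRefresh`, and the 60-second margin are hypothetical choices, not details of the Socialhose API or of SocialhoseApiClient.

```cpp
#include <cassert>
#include <chrono>
#include <string>

using Clock = std::chrono::system_clock;

// Hypothetical session state; field names are illustrative only.
struct AuthSession {
    std::string accessToken;
    Clock::time_point expiresAt;
};

// Report whether the token should be refreshed before the next request.
// Refreshing `margin` ahead of expiry keeps in-flight requests from carrying
// a token that expires while they are being processed.
inline bool needsRefresh(const AuthSession& session, Clock::time_point now,
                         std::chrono::seconds margin = std::chrono::seconds(60)) {
    return session.accessToken.empty() || now + margin >= session.expiresAt;
}
```

Passing `now` in as a parameter, instead of reading the clock inside the function, keeps the check easy to test.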

Troubleshooting

Common Issues

  1. Qt 6 Migration: The project has been migrated to Qt 6; build with Qt 6.x only
  2. Database Connection: Check that the MySQL service is running and that the config.ini settings are correct
  3. OCR Engine Crashes: Automatic restart is enabled; check MiniDump.dmp for details
  4. API Connection: Verify base_url in config.ini and network connectivity

Debug Mode

Enable debug logging in config.ini:

[debug mode]
debug=on

Support

  • Examine configuration: ClippingStation/config.ini
  • When all else fails, email [email protected]

License

This project is licensed under the GNU General Public License v3.0 (GPLv3).
See the LICENSE file for full license details.
