`titor`

A high-performance checkpointing library for Rust that enables time-travel capabilities through directory snapshots. Titor provides efficient incremental backups with cryptographic verification, content deduplication, and parallel processing.

Features

Core Capabilities

Incremental Snapshots: Only changed files are stored between checkpoints, minimizing storage overhead
Content-Addressable Storage: Automatic deduplication across all checkpoints using SHA-256 hashing
LZ4 Compression: Extreme-speed compression (500+ MB/s) with adaptive strategies
Parallel Processing: Multi-threaded file scanning and hashing leveraging all available CPU cores
Merkle Tree Verification: Cryptographic integrity checking for tamper detection
Timeline Branching: Git-like branching model supporting multiple independent timelines
Atomic Operations: All checkpoint and restore operations are atomic with rollback on failure
Memory Efficient: Streaming architecture for large files without full memory loading
Line-Level Diffs: Git-like unified diff output showing exactly what changed between checkpoints
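
The atomicity listed above is commonly achieved by staging data in a temporary file and renaming it into place. The following is a minimal sketch of that general pattern, not Titor's actual implementation; `atomic_write` is a hypothetical helper that uses only the standard library.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::Path;

/// Hypothetical helper: write `bytes` to `dest` so that readers see either
/// the old file or the complete new one, never a partial write.
fn atomic_write(dest: &Path, bytes: &[u8]) -> io::Result<()> {
    // Stage the data next to the destination so the rename stays on one filesystem.
    let tmp = dest.with_extension("tmp");
    let result = (|| -> io::Result<()> {
        let mut file = File::create(&tmp)?;
        file.write_all(bytes)?;
        file.sync_all()?; // flush to disk before publishing
        fs::rename(&tmp, dest) // replaces `dest` in a single step
    })();
    if result.is_err() {
        // Roll back: discard the staged file and leave `dest` untouched.
        let _ = fs::remove_file(&tmp);
    }
    result
}
```

If any step fails, the destination keeps its previous content, which is the property a checkpoint or restore operation needs in order to roll back cleanly.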

Technical Specifications

Compression: LZ4 via lz4_flex crate with 4+ GB/s decompression speeds
Hashing: SHA-256 for content identification and verification
Serialization: Bincode for efficient binary storage of metadata
Concurrency: Rayon-based parallel processing with configurable worker pools
Storage: Sharded object storage with reference counting for garbage collection

Architecture

System Overview

Titor implements a content-addressable storage system with the following components:

Storage Layer: Manages compressed object storage with automatic deduplication
Checkpoint System: Creates and manages independent snapshots with metadata
Timeline Manager: Maintains DAG structure of checkpoints with branching support
Verification Engine: Provides cryptographic integrity checking via Merkle trees
Compression Engine: Adaptive LZ4 compression with configurable strategies

Storage Layout

storage_root/
├── metadata.json          # Storage configuration and version info
├── timeline.json          # Timeline DAG structure  
├── checkpoints/           # Checkpoint metadata directory
│   └── {checkpoint_id}/   
│       ├── metadata.json  # Checkpoint metadata (size, timestamps, etc.)
│       └── manifest.bin   # Binary manifest of all files (bincode format)
├── objects/               # Content-addressable object storage
│   └── {prefix}/          # Two-character sharding for performance
│       └── {hash}         # LZ4-compressed file content
└── refs/                  # Reference counting for garbage collection
    └── {object_hash}      # Reference count per object

Object Storage

Files are stored using content-based addressing:

SHA-256 hash computed for file content
Objects sharded by first 2 hash characters for filesystem performance
Reference counting enables safe garbage collection
Compression applied based on configurable strategies
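
As a rough illustration of the sharding described above, the sketch below maps a hex-encoded hash to a path under `objects/`. The helper name `object_path` is hypothetical, and the assumption that the file name is the full hash (rather than the hash minus its prefix) is inferred from the storage layout, not confirmed by the source.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: map a hex-encoded content hash to its location under
/// `objects/`, using the first two characters as the shard directory.
fn object_path(storage_root: &Path, hash_hex: &str) -> Option<PathBuf> {
    // Reject anything that is not plain hex, so the split below cannot cut a character.
    if hash_hex.len() < 3 || !hash_hex.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    let (prefix, _) = hash_hex.split_at(2);
    // Assumption: the object file is named after the full hash.
    Some(storage_root.join("objects").join(prefix).join(hash_hex))
}
```

With two hex characters there are at most 256 shard directories, which keeps any single directory from accumulating every object in the store.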

Installation

Add to your Cargo.toml:

[dependencies]
titor = "0.2.0"

Or install the CLI tool with:

cargo install titor

Dependencies

Required system dependencies:

Rust 1.70 or later
Platform-specific filesystem capabilities for symbolic links

Quick Start

Basic Usage

use titor::{Titor, TitorBuilder, CompressionStrategy};
use std::path::PathBuf;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Initialize with adaptive compression
    let mut titor = TitorBuilder::new()
        .compression_strategy(CompressionStrategy::Adaptive {
            min_size: 4096,
            skip_extensions: vec!["jpg".to_string(), "mp4".to_string()],
        })
        .parallel_workers(8)
        .build(
            PathBuf::from("/path/to/project"),
            PathBuf::from("/path/to/project/.titor")
        )?;

    // Create checkpoint
    let checkpoint = titor.checkpoint(Some("Initial state".to_string()))?;
    println!("Created checkpoint: {}", checkpoint.id);

    // Restore to checkpoint
    titor.restore(&checkpoint.id)?;

    Ok(())
}

Line-Level Diff Example

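// LineChange is assumed to be exported from titor::types alongside DiffOptions.
use titor::types::{DiffOptions, LineChange};

// `titor`, `checkpoint1`, and `checkpoint2` are assumed to exist from earlier setup.
// Get detailed diff with line-level changes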
let options = DiffOptions {
    context_lines: 3,
    ignore_whitespace: false,
    show_line_numbers: true,
    max_file_size: 10 * 1024 * 1024, // 10MB
};

let detailed_diff = titor.diff_detailed(&checkpoint1.id, &checkpoint2.id, options)?;

// Display results
println!("Total lines added: {}", detailed_diff.total_lines_added);
println!("Total lines deleted: {}", detailed_diff.total_lines_deleted);

for file_diff in &detailed_diff.file_diffs {
    println!("\nFile: {:?}", file_diff.path);
    
    for hunk in &file_diff.hunks {
        println!("@@ -{},{} +{},{} @@", 
            hunk.from_line, hunk.from_count,
            hunk.to_line, hunk.to_count);
            
        for change in &hunk.changes {
            match change {
                LineChange::Added(_, line) => println!("+{}", line),
                LineChange::Deleted(_, line) => println!("-{}", line),
                LineChange::Context(_, line) => println!(" {}", line),
            }
        }
    }
}

Advanced Configuration

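use std::sync::Arc;
use titor::{CompressionStrategy, TitorBuilder};

// Assumes `root_path` and `storage_path` are PathBufs in scope,
// and that the `num_cpus` crate is listed in Cargo.toml.
let titor = TitorBuilder::new()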
    .compression_strategy(CompressionStrategy::Custom(Arc::new(|path, size| {
        // Custom compression logic
        size > 1024 && !path.extension().map_or(false, |ext| ext == "zip")
    })))
    .ignore_patterns(vec![
        "*.tmp".to_string(),
        "node_modules/**".to_string(),
    ])
    .max_file_size(100 * 1024 * 1024) // 100MB limit
    .follow_symlinks(false)
    .parallel_workers(num_cpus::get())
    .build(root_path, storage_path)?;

API Documentation

Core Types

`Titor`

The main interface for checkpoint operations.

impl Titor {
    /// Create a new checkpoint capturing current directory state
    pub fn checkpoint(&mut self, description: Option<String>) -> Result<Checkpoint>;
    
    /// Restore directory to a specific checkpoint state
    pub fn restore(&mut self, checkpoint_id: &str) -> Result<RestoreResult>;
    
    /// List all checkpoints in chronological order
    pub fn list_checkpoints(&self) -> Result<Vec<Checkpoint>>;
    
    /// Get timeline DAG structure
    pub fn get_timeline(&self) -> Result<Timeline>;
    
    /// Create a new branch from existing checkpoint
    pub fn fork(&mut self, checkpoint_id: &str, description: Option<String>) -> Result<Checkpoint>;
    
    /// Compare two checkpoints (file-level)
    pub fn diff(&self, from_id: &str, to_id: &str) -> Result<CheckpointDiff>;
    
    /// Compare file with line-level differences
    pub fn diff_file(&self, from_id: &str, to_id: &str, path: &Path, options: DiffOptions) -> Result<FileDiff>;
    
    /// Get detailed diff with line-level changes for all files
    pub fn diff_detailed(&self, from_id: &str, to_id: &str, options: DiffOptions) -> Result<DetailedCheckpointDiff>;
    
    /// Garbage collect unreferenced objects
    pub fn gc(&self) -> Result<GcStats>;
    
    /// Verify checkpoint integrity
    pub fn verify_checkpoint(&self, checkpoint_id: &str) -> Result<VerificationReport>;
}

`CompressionStrategy`

Configurable compression strategies for different use cases.

pub enum CompressionStrategy {
    /// No compression - maximum speed
    None,
    
    /// LZ4 compression for all files (default)
    Fast,
    
    /// Adaptive compression based on file attributes
    Adaptive {
        min_size: usize,              // Skip files smaller than this
        skip_extensions: Vec<String>, // Skip these extensions
    },
    
    /// Custom compression predicate
    Custom(Arc<dyn Fn(&Path, usize) -> bool + Send + Sync>),
}
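
To make the variants concrete, here is a sketch of how a per-file compression decision could be derived from them. The enum is a local mirror of the one documented above so the example compiles on its own; `should_compress` is a hypothetical function, and the `Adaptive` semantics are an assumption based on the field comments.

```rust
use std::path::Path;
use std::sync::Arc;

/// Local mirror of the documented enum so the sketch compiles on its own.
enum CompressionStrategy {
    None,
    Fast,
    Adaptive { min_size: usize, skip_extensions: Vec<String> },
    Custom(Arc<dyn Fn(&Path, usize) -> bool + Send + Sync>),
}

/// Hypothetical decision function: should a file of `size` bytes at `path` be compressed?
fn should_compress(strategy: &CompressionStrategy, path: &Path, size: usize) -> bool {
    match strategy {
        CompressionStrategy::None => false,
        CompressionStrategy::Fast => true,
        CompressionStrategy::Adaptive { min_size, skip_extensions } => {
            // Assumed semantics: skip files below `min_size` and listed extensions.
            let skipped = path
                .extension()
                .and_then(|ext| ext.to_str())
                .map_or(false, |ext| skip_extensions.iter().any(|s| s.as_str() == ext));
            size >= *min_size && !skipped
        }
        // Defer entirely to the user-supplied predicate.
        CompressionStrategy::Custom(predicate) => predicate(path, size),
    }
}
```

Skipping formats such as `jpg` or `mp4` avoids spending CPU time on data that is already compressed and would not shrink further.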

`Checkpoint`

Represents a point-in-time snapshot.

pub struct Checkpoint {
    pub id: String,                          // UUID v4 identifier
    pub parent_id: Option<String>,           // Parent checkpoint for timeline
    pub timestamp: DateTime<Utc>,            // Creation timestamp
    pub description: Option<String>,         // User description
    pub metadata: CheckpointMetadata,        // Size, file count, etc.
    pub state_hash: String,                  // SHA-256 of checkpoint state
    pub content_merkle_root: String,         // Merkle tree root hash
}

`DiffOptions`

Configure line-level diff generation.

pub struct DiffOptions {
    pub context_lines: usize,      // Lines of context (default: 3)
    pub ignore_whitespace: bool,   // Ignore whitespace changes
    pub show_line_numbers: bool,   // Include line numbers
    pub max_file_size: u64,       // Max file size for diffs (default: 10MB)
}

`FileDiff`

Line-level diff information for a single file.

pub struct FileDiff {
    pub path: PathBuf,
    pub is_binary: bool,
    pub hunks: Vec<DiffHunk>,     // Contiguous blocks of changes
    pub lines_added: usize,
    pub lines_deleted: usize,
}
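
For intuition about where `lines_added` and `lines_deleted` come from, the sketch below computes a line diff with a textbook longest-common-subsequence table. This is a stand-in algorithm chosen for brevity; the source does not say which diff algorithm Titor uses, and `Change`, `diff_lines`, and `count_changes` are hypothetical names.

```rust
/// Hypothetical per-line change, loosely modeled on the documented `LineChange`.
#[derive(Debug, PartialEq)]
enum Change<'a> {
    Context(&'a str),
    Added(&'a str),
    Deleted(&'a str),
}

/// Diff two texts line by line using a longest-common-subsequence table.
fn diff_lines<'a>(old: &'a str, new: &'a str) -> Vec<Change<'a>> {
    let a: Vec<&str> = old.lines().collect();
    let b: Vec<&str> = new.lines().collect();
    // lcs[i][j] = length of the LCS of a[i..] and b[j..]
    let mut lcs = vec![vec![0usize; b.len() + 1]; a.len() + 1];
    for i in (0..a.len()).rev() {
        for j in (0..b.len()).rev() {
            lcs[i][j] = if a[i] == b[j] {
                lcs[i + 1][j + 1] + 1
            } else {
                lcs[i + 1][j].max(lcs[i][j + 1])
            };
        }
    }
    // Walk the table from the start, emitting changes in file order.
    let (mut i, mut j) = (0, 0);
    let mut out = Vec::new();
    while i < a.len() && j < b.len() {
        if a[i] == b[j] {
            out.push(Change::Context(a[i]));
            i += 1;
            j += 1;
        } else if lcs[i + 1][j] >= lcs[i][j + 1] {
            out.push(Change::Deleted(a[i]));
            i += 1;
        } else {
            out.push(Change::Added(b[j]));
            j += 1;
        }
    }
    out.extend(a[i..].iter().map(|&l| Change::Deleted(l)));
    out.extend(b[j..].iter().map(|&l| Change::Added(l)));
    out
}

/// Returns (lines_added, lines_deleted) for a list of changes.
fn count_changes(changes: &[Change<'_>]) -> (usize, usize) {
    let added = changes.iter().filter(|c| matches!(c, Change::Added(_))).count();
    let deleted = changes.iter().filter(|c| matches!(c, Change::Deleted(_))).count();
    (added, deleted)
}
```

The table costs memory proportional to the product of the two line counts, which is one reason a size cap such as `max_file_size` is useful for line-level diffs.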

Error Handling

Titor reports failures through a single error enum, TitorError:

#[derive(Debug, thiserror::Error)]
pub enum TitorError {
    #[error("IO error: {0}")]
    Io(#[from] std::io::Error),
    
    #[error("Checkpoint not found: {0}")]
    CheckpointNotFound(String),
    
    #[error("Storage corruption detected: {0}")]
    StorageCorruption(String),
    
    #[error("Checkpoint has children: {0}")]
    CheckpointHasChildren(String),
    
    // Additional error variants...
}
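
The `#[error("...")]` attributes above produce a `Display` implementation for each variant. The sketch below writes that out by hand for two of the documented variants, using a trimmed local mirror of the enum so it compiles without `thiserror`; the `hint` helper is hypothetical and only shows how a caller might branch on the variant.

```rust
use std::fmt;

/// Trimmed local mirror of two documented variants, without the `thiserror` derive.
#[derive(Debug)]
enum TitorError {
    CheckpointNotFound(String),
    CheckpointHasChildren(String),
}

// Roughly what `#[error("...")]` generates: one formatted message per variant.
impl fmt::Display for TitorError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            TitorError::CheckpointNotFound(id) => write!(f, "Checkpoint not found: {}", id),
            TitorError::CheckpointHasChildren(id) => write!(f, "Checkpoint has children: {}", id),
        }
    }
}

impl std::error::Error for TitorError {}

/// Hypothetical caller-side helper: suggest a next step for each error.
fn hint(err: &TitorError) -> &'static str {
    match err {
        TitorError::CheckpointNotFound(_) => "run `titor list` to see valid checkpoint IDs",
        TitorError::CheckpointHasChildren(_) => "pick a checkpoint without descendants",
    }
}
```

Because each failure is a distinct variant, callers can match on it instead of parsing message strings.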

Performance

Optimization Strategies

Parallel File Scanning: Utilizes Rayon for concurrent directory traversal
Streaming Compression: Processes large files in chunks to minimize memory usage
Content Deduplication: Identical files stored only once across all checkpoints
Lazy Loading: Objects loaded from storage only when needed
Sharded Storage: Two-character prefix sharding prevents filesystem bottlenecks
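
Content deduplication and reference-counted garbage collection fit together as in the in-memory sketch below. It is an illustration under stated assumptions, not Titor's storage code: `ObjectStore` and its methods are hypothetical, and the standard library's `DefaultHasher` stands in for SHA-256 purely to avoid external dependencies.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Stand-in content hash. Titor uses SHA-256; `DefaultHasher` is used here only
/// to keep the sketch dependency-free and is NOT collision-resistant.
fn content_id(bytes: &[u8]) -> String {
    let mut hasher = DefaultHasher::new();
    bytes.hash(&mut hasher);
    format!("{:016x}", hasher.finish())
}

/// Hypothetical in-memory object store: one copy per distinct content,
/// plus a reference count consulted by garbage collection.
#[derive(Default)]
struct ObjectStore {
    objects: HashMap<String, (Vec<u8>, usize)>,
}

impl ObjectStore {
    /// Store `bytes` (or bump the count if already present) and return the ID.
    fn put(&mut self, bytes: &[u8]) -> String {
        let id = content_id(bytes);
        let entry = self.objects.entry(id.clone()).or_insert_with(|| (bytes.to_vec(), 0));
        entry.1 += 1;
        id
    }

    /// Drop one reference, for example when a checkpoint using the object goes away.
    fn release(&mut self, id: &str) {
        if let Some(entry) = self.objects.get_mut(id) {
            entry.1 = entry.1.saturating_sub(1);
        }
    }

    /// Remove every object whose count reached zero; returns how many were removed.
    fn gc(&mut self) -> usize {
        let before = self.objects.len();
        self.objects.retain(|_, (_, refs)| *refs > 0);
        before - self.objects.len()
    }
}
```

Storing the same bytes twice yields the same ID and a count of two, so the object survives garbage collection until both references are released.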

Storage Format

Object Format

Each stored object consists of:

A 4-byte header indicating compression status
A payload, which is either the LZ4-compressed content (if compression was applied) or the original content (if compression was not beneficial)
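
A header-plus-payload format like this one can be read and written with a few lines of code. The sketch below assumes the header is a little-endian `u32` flag with `0` for raw and `1` for LZ4; those values and the function names are assumptions for illustration, since the source only states that the header is four bytes. The payload is treated as opaque bytes, so no actual LZ4 call is made.

```rust
/// Assumed flag values; the real on-disk encoding may differ.
const RAW: u32 = 0;
const LZ4: u32 = 1;

/// Prepend the 4-byte header to an already-prepared payload.
fn encode_object(compressed: bool, payload: &[u8]) -> Vec<u8> {
    let flag = if compressed { LZ4 } else { RAW };
    let mut out = Vec::with_capacity(4 + payload.len());
    out.extend_from_slice(&flag.to_le_bytes());
    out.extend_from_slice(payload);
    out
}

/// Split a stored object into (is_compressed, payload); `None` if malformed.
fn decode_object(stored: &[u8]) -> Option<(bool, &[u8])> {
    if stored.len() < 4 {
        return None; // too short to contain a header
    }
    let flag = u32::from_le_bytes([stored[0], stored[1], stored[2], stored[3]]);
    let payload = &stored[4..];
    match flag {
        RAW => Some((false, payload)),
        LZ4 => Some((true, payload)),
        _ => None, // unknown flag: treat as corruption
    }
}
```

Keeping the flag in the object itself lets a reader handle compressed and uncompressed objects without consulting any external metadata.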

Manifest Format

File manifests use Bincode serialization for efficiency:

pub struct FileManifest {
    pub checkpoint_id: String,
    pub files: Vec<FileEntry>,
    pub total_size: u64,
    pub file_count: usize,
    pub merkle_root: String,
    pub created_at: DateTime<Utc>,
}

Security

Cryptographic Verification

Titor implements multiple layers of integrity checking:

Content Hashing: SHA-256 hash for each file
Merkle Trees: Cryptographic proof of entire checkpoint state
State Hashing: Combined hash of all checkpoint components
Tamper Detection: Automatic corruption detection during verification
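
The Merkle tree layer can be pictured as repeatedly hashing pairs of per-file hashes until one root remains. The sketch below shows that folding step only. It makes two loud assumptions: `DefaultHasher` stands in for SHA-256 (it is not cryptographic), and an unpaired node is combined with itself, which is one common convention that Titor may or may not follow.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

/// Stand-in 64-bit hash. Titor uses SHA-256; `DefaultHasher` is NOT cryptographic
/// and is used here only to keep the sketch dependency-free.
fn h(data: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    hasher.write(data);
    hasher.finish()
}

/// Fold leaf hashes pairwise until a single root remains.
/// Returns `None` for an empty input.
fn merkle_root(leaves: &[u64]) -> Option<u64> {
    if leaves.is_empty() {
        return None;
    }
    let mut level = leaves.to_vec();
    while level.len() > 1 {
        level = level
            .chunks(2)
            .map(|pair| {
                // Assumed convention: a node without a sibling is paired with itself.
                let right = *pair.last().unwrap();
                let mut buf = [0u8; 16];
                buf[..8].copy_from_slice(&pair[0].to_be_bytes());
                buf[8..].copy_from_slice(&right.to_be_bytes());
                h(&buf)
            })
            .collect();
    }
    Some(level[0])
}
```

Changing any single leaf changes every hash on the path to the root, so comparing one stored root against a recomputed one is enough to detect that something in the checkpoint was altered.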

Verification API

// Assumes `storage` (the storage backend) and `checkpoint` (a loaded Checkpoint) are in scope.
let verifier = CheckpointVerifier::new(&storage);
let report = verifier.verify_complete(&checkpoint)?;

if !report.is_valid() {
    eprintln!("Verification failed: {}", report.summary());
    for error in &report.errors {
        eprintln!("  - {}", error);
    }
}

CLI Usage

Titor includes a comprehensive command-line interface.

Installation

# Install the titor CLI from crates.io
cargo install titor
# Or install from local path
cargo install --path .

Commands

# Initialize repository
titor init --compression adaptive

# Create checkpoint
titor checkpoint -m "Before refactoring"

# List checkpoints
titor list --detailed

# Restore to checkpoint
titor restore <checkpoint-id>

# Show timeline tree
titor timeline

# Compare checkpoints
titor diff <from-id> <to-id>

# Compare with line-level differences (git-like)
titor diff <from-id> <to-id> --lines

# Compare with custom context lines
titor diff <from-id> <to-id> --lines --context 5

# Show only statistics
titor diff <from-id> <to-id> --stat

# Ignore whitespace changes
titor diff <from-id> <to-id> --lines --ignore-whitespace

# Verify integrity
titor verify --all

# Garbage collection
titor gc --dry-run

Advanced Features

Timeline Branching

Create independent branches for experimentation:

// Fork from existing checkpoint
let fork = titor.fork(&checkpoint_id, Some("Experimental branch".to_string()))?;

// Work on branch...

// Later, restore to original
titor.restore(&checkpoint_id)?;
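
Conceptually, branching only requires that each checkpoint records its parent: a fork is then a second child of the same parent. The sketch below models that with a parent map. The `Timeline` type here is a hypothetical, simplified stand-in and does not reflect the library's actual `Timeline` structure.

```rust
use std::collections::HashMap;

/// Hypothetical minimal timeline: each checkpoint ID maps to its optional parent.
#[derive(Default)]
struct Timeline {
    parents: HashMap<String, Option<String>>,
}

impl Timeline {
    /// Record a checkpoint and the checkpoint it was created from.
    fn add(&mut self, id: &str, parent: Option<&str>) {
        self.parents.insert(id.to_string(), parent.map(str::to_string));
    }

    /// Walk parent links from `id` back to the root, newest first.
    fn ancestry(&self, id: &str) -> Vec<String> {
        let mut chain = Vec::new();
        let mut current = self.parents.get_key_value(id);
        while let Some((key, parent)) = current {
            chain.push(key.clone());
            current = parent.as_deref().and_then(|p| self.parents.get_key_value(p));
        }
        chain
    }

    /// Children of `id`, sorted; more than one means the timeline branches there.
    fn children(&self, id: &str) -> Vec<&str> {
        let mut kids: Vec<&str> = self
            .parents
            .iter()
            .filter(|(_, parent)| parent.as_deref() == Some(id))
            .map(|(child, _)| child.as_str())
            .collect();
        kids.sort();
        kids
    }
}
```

Restoring to the original checkpoint leaves the fork intact, because neither branch rewrites the other's parent links.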

Auto-Checkpoint Strategies

Configure automatic checkpoint creation:

use std::time::Duration;

titor.set_auto_checkpoint(AutoCheckpointStrategy::Smart {
    min_files_changed: 10,
    min_size_changed: 1024 * 1024, // 1MB
    max_time_between: Duration::from_secs(3600), // 1 hour
});
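
One plausible reading of the `Smart` thresholds is "checkpoint as soon as any limit is reached, provided something changed". The sketch below encodes that reading. It is an assumption about the semantics, not documented behavior; `SmartThresholds` and `should_checkpoint` are hypothetical names, and the integer field types are guesses.

```rust
use std::time::Duration;

/// Local mirror of the thresholds in the documented `Smart` variant.
struct SmartThresholds {
    min_files_changed: usize,
    min_size_changed: u64,
    max_time_between: Duration,
}

/// Assumed semantics: trigger once ANY threshold is reached, but never
/// when nothing has changed since the last checkpoint.
fn should_checkpoint(
    t: &SmartThresholds,
    files_changed: usize,
    bytes_changed: u64,
    since_last: Duration,
) -> bool {
    if files_changed == 0 && bytes_changed == 0 {
        return false; // nothing to capture
    }
    files_changed >= t.min_files_changed
        || bytes_changed >= t.min_size_changed
        || since_last >= t.max_time_between
}
```

Under this reading, the time limit acts as a safety net: small edits that never hit the file or size thresholds are still captured at least once per interval.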

Checkpoint Hooks

Implement custom logic for checkpoint events:

// Unit struct carrying the hook implementation
struct MyHook;

impl CheckpointHook for MyHook {
    fn pre_checkpoint(&self, stats: &ChangeStats) -> Result<()> {
        println!("Creating checkpoint with {} changes", stats.total_operations());
        Ok(())
    }
    
    fn post_checkpoint(&self, checkpoint: &Checkpoint) -> Result<()> {
        // Custom post-checkpoint logic
        Ok(())
    }
}

titor.add_hook(Box::new(MyHook));

Development

Building from Source

git clone https://github.com/mufeedvh/titor.git
cd titor
cargo build --release

Running Tests

# Unit and integration tests
cargo test

# Run only the tests marked #[ignore]
cargo test -- --ignored

# Run with logging
RUST_LOG=debug cargo test

Project Structure

titor/
├── src/
│   ├── lib.rs              # Public API
│   ├── titor.rs            # Core implementation
│   ├── storage.rs          # Storage backend
│   ├── checkpoint.rs       # Checkpoint types
│   ├── timeline.rs         # Timeline management
│   ├── compression.rs      # Compression engine
│   ├── verification.rs     # Integrity checking
│   └── merkle.rs           # Merkle tree implementation
├── benches/                # Performance benchmarks
├── examples/               # Example implementations
└── tests/                  # Integration tests

License

Licensed under the MIT License. See LICENSE for more information.
