
A high-performance checkpointing library for Rust that enables time-travel capabilities through directory snapshots. Titor provides efficient incremental backups with cryptographic verification, content deduplication, and parallel processing.
- Features
- Architecture
- Installation
- Quick Start
- API Documentation
- Performance
- Storage Format
- Security
- CLI Usage
- Advanced Features
- Development
- License
- Incremental Snapshots: Only changed files are stored between checkpoints, minimizing storage overhead
- Content-Addressable Storage: Automatic deduplication across all checkpoints using SHA-256 hashing
- LZ4 Compression: Extreme-speed compression (500+ MB/s) with adaptive strategies
- Parallel Processing: Multi-threaded file scanning and hashing leveraging all available CPU cores
- Merkle Tree Verification: Cryptographic integrity checking for tamper detection
- Timeline Branching: Git-like branching model supporting multiple independent timelines
- Atomic Operations: All checkpoint and restore operations are atomic with rollback on failure
- Memory Efficient: Streaming architecture for large files without full memory loading
- Line-Level Diffs: Git-like unified diff output showing exactly what changed between checkpoints
- Compression: LZ4 via the lz4_flex crate with 4+ GB/s decompression speeds
- Hashing: SHA-256 for content identification and verification
- Serialization: Bincode for efficient binary storage of metadata
- Concurrency: Rayon-based parallel processing with configurable worker pools
- Storage: Sharded object storage with reference counting for garbage collection
Titor implements a content-addressable storage system with the following components:
- Storage Layer: Manages compressed object storage with automatic deduplication
- Checkpoint System: Creates and manages independent snapshots with metadata
- Timeline Manager: Maintains DAG structure of checkpoints with branching support
- Verification Engine: Provides cryptographic integrity checking via Merkle trees
- Compression Engine: Adaptive LZ4 compression with configurable strategies
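To make the Timeline Manager's DAG idea concrete, here is a minimal sketch of a parent-pointer timeline. It assumes an acyclic graph; `TimelineSketch` and its methods are hypothetical names for illustration, not Titor's actual types.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory timeline: each checkpoint id maps to its parent id
/// (None for a root). Illustrative only; not Titor's real data structure.
#[derive(Default)]
pub struct TimelineSketch {
    parents: HashMap<String, Option<String>>,
}

impl TimelineSketch {
    /// Record a checkpoint and its optional parent.
    pub fn add(&mut self, id: &str, parent: Option<&str>) {
        self.parents.insert(id.to_string(), parent.map(str::to_string));
    }

    /// Ids of checkpoints whose parent is `id`, sorted for stable output.
    pub fn children(&self, id: &str) -> Vec<String> {
        let mut out: Vec<String> = self
            .parents
            .iter()
            .filter(|(_, p)| p.as_deref() == Some(id))
            .map(|(child, _)| child.clone())
            .collect();
        out.sort();
        out
    }

    /// Walk parent links from `id` back to the root, nearest ancestor first.
    /// Assumes the parent links contain no cycles.
    pub fn ancestors(&self, id: &str) -> Vec<String> {
        let mut out = Vec::new();
        let mut cur = self.parents.get(id).cloned().flatten();
        while let Some(p) = cur {
            cur = self.parents.get(&p).cloned().flatten();
            out.push(p);
        }
        out
    }
}
```

A checkpoint with more than one child marks a branch point, and a non-empty `children` result is the kind of condition that could back an error such as `CheckpointHasChildren`.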
storage_root/
├── metadata.json           # Storage configuration and version info
├── timeline.json           # Timeline DAG structure
├── checkpoints/            # Checkpoint metadata directory
│   └── {checkpoint_id}/
│       ├── metadata.json   # Checkpoint metadata (size, timestamps, etc.)
│       └── manifest.bin    # Binary manifest of all files (bincode format)
├── objects/                # Content-addressable object storage
│   └── {prefix}/           # Two-character sharding for performance
│       └── {hash}          # LZ4-compressed file content
└── refs/                   # Reference counting for garbage collection
    └── {object_hash}       # Reference count per object
Files are stored using content-based addressing:
- SHA-256 hash computed for file content
- Objects sharded by first 2 hash characters for filesystem performance
- Reference counting enables safe garbage collection
- Compression applied based on configurable strategies
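The sharding rule above can be sketched as a small path helper. This assumes the hash is given as a 64-character hex string (the length of a SHA-256 digest) and that the layout matches the `objects/{prefix}/{hash}` tree shown earlier; `object_path` is a hypothetical name, not part of Titor's API.

```rust
use std::path::{Path, PathBuf};

/// Build `<storage_root>/objects/<first two hex chars>/<full hash>`.
/// Returns None unless `hash_hex` is exactly 64 ASCII hex digits.
pub fn object_path(storage_root: &Path, hash_hex: &str) -> Option<PathBuf> {
    let is_hex = hash_hex.len() == 64 && hash_hex.bytes().all(|b| b.is_ascii_hexdigit());
    if !is_hex {
        return None;
    }
    // Safe to slice by bytes: the string is known to be ASCII here.
    let prefix = &hash_hex[..2];
    Some(storage_root.join("objects").join(prefix).join(hash_hex))
}
```

With two hex characters there are at most 256 shard directories, which keeps any single directory from accumulating every object.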
Add to your Cargo.toml:
[dependencies]
titor = "0.2.0"
Or install the CLI tool with:
cargo install titor
Required system dependencies:
- Rust 1.70 or later
- Platform-specific filesystem capabilities for symbolic links
use titor::{Titor, TitorBuilder, CompressionStrategy};
use std::path::PathBuf;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Initialize with adaptive compression
    let mut titor = TitorBuilder::new()
        .compression_strategy(CompressionStrategy::Adaptive {
            min_size: 4096,
            skip_extensions: vec!["jpg".to_string(), "mp4".to_string()],
        })
        .parallel_workers(8)
        .build(
            PathBuf::from("/path/to/project"),
            PathBuf::from("/path/to/project/.titor"),
        )?;

    // Create checkpoint
    let checkpoint = titor.checkpoint(Some("Initial state".to_string()))?;
    println!("Created checkpoint: {}", checkpoint.id);

    // Restore to checkpoint
    titor.restore(&checkpoint.id)?;
    Ok(())
}
// LineChange is assumed to be exported from the same module as DiffOptions
use titor::types::{DiffOptions, LineChange};

// `titor`, `checkpoint1`, and `checkpoint2` come from earlier checkpoint() calls

// Get detailed diff with line-level changes
let options = DiffOptions {
    context_lines: 3,
    ignore_whitespace: false,
    show_line_numbers: true,
    max_file_size: 10 * 1024 * 1024, // 10MB
};
let detailed_diff = titor.diff_detailed(&checkpoint1.id, &checkpoint2.id, options)?;

// Display results
println!("Total lines added: {}", detailed_diff.total_lines_added);
println!("Total lines deleted: {}", detailed_diff.total_lines_deleted);
for file_diff in &detailed_diff.file_diffs {
    println!("\nFile: {:?}", file_diff.path);
    for hunk in &file_diff.hunks {
        // Unified-diff style hunk header
        println!("@@ -{},{} +{},{} @@",
            hunk.from_line, hunk.from_count,
            hunk.to_line, hunk.to_count);
        for change in &hunk.changes {
            match change {
                LineChange::Added(_, line) => println!("+{}", line),
                LineChange::Deleted(_, line) => println!("-{}", line),
                LineChange::Context(_, line) => println!(" {}", line),
            }
        }
    }
}
use std::sync::Arc;

// num_cpus is an external crate; root_path and storage_path are PathBuf values defined elsewhere
let titor = TitorBuilder::new()
    .compression_strategy(CompressionStrategy::Custom(Arc::new(|path, size| {
        // Custom compression logic: compress files over 1 KiB unless they are .zip archives
        size > 1024 && !path.extension().map_or(false, |ext| ext == "zip")
    })))
    .ignore_patterns(vec![
        "*.tmp".to_string(),
        "node_modules/**".to_string(),
    ])
    .max_file_size(100 * 1024 * 1024) // 100MB limit
    .follow_symlinks(false)
    .parallel_workers(num_cpus::get())
    .build(root_path, storage_path)?;
The Titor struct is the main interface for checkpoint operations.
impl Titor {
    /// Create a new checkpoint capturing current directory state
    pub fn checkpoint(&mut self, description: Option<String>) -> Result<Checkpoint>;

    /// Restore directory to a specific checkpoint state
    pub fn restore(&mut self, checkpoint_id: &str) -> Result<RestoreResult>;

    /// List all checkpoints in chronological order
    pub fn list_checkpoints(&self) -> Result<Vec<Checkpoint>>;

    /// Get timeline DAG structure
    pub fn get_timeline(&self) -> Result<Timeline>;

    /// Create a new branch from existing checkpoint
    pub fn fork(&mut self, checkpoint_id: &str, description: Option<String>) -> Result<Checkpoint>;

    /// Compare two checkpoints (file-level)
    pub fn diff(&self, from_id: &str, to_id: &str) -> Result<CheckpointDiff>;

    /// Compare file with line-level differences
    pub fn diff_file(&self, from_id: &str, to_id: &str, path: &Path, options: DiffOptions) -> Result<FileDiff>;

    /// Get detailed diff with line-level changes for all files
    pub fn diff_detailed(&self, from_id: &str, to_id: &str, options: DiffOptions) -> Result<DetailedCheckpointDiff>;

    /// Garbage collect unreferenced objects
    pub fn gc(&self) -> Result<GcStats>;

    /// Verify checkpoint integrity
    pub fn verify_checkpoint(&self, checkpoint_id: &str) -> Result<VerificationReport>;
}
CompressionStrategy selects how files are compressed for different use cases.
pub enum CompressionStrategy {
    /// No compression - maximum speed
    None,
    /// LZ4 compression for all files (default)
    Fast,
    /// Adaptive compression based on file attributes
    Adaptive {
        min_size: usize,              // Skip files smaller than this
        skip_extensions: Vec<String>, // Skip these extensions
    },
    /// Custom compression predicate
    Custom(Arc<dyn Fn(&Path, usize) -> bool + Send + Sync>),
}
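As an illustration of how such a strategy might be evaluated per file, here is a self-contained sketch. The enum is a local mirror of the one above so the block compiles on its own; `should_compress` is a hypothetical helper, and the case-insensitive extension match and the "true means compress" reading of the custom predicate are assumptions rather than documented behavior.

```rust
use std::path::Path;
use std::sync::Arc;

/// Local mirror of the documented enum (not imported from the titor crate).
pub enum CompressionStrategy {
    None,
    Fast,
    Adaptive { min_size: usize, skip_extensions: Vec<String> },
    Custom(Arc<dyn Fn(&Path, usize) -> bool + Send + Sync>),
}

/// Decide whether a file of `size` bytes at `path` should be compressed.
/// Assumption: extensions are compared without the dot, ignoring ASCII case.
pub fn should_compress(strategy: &CompressionStrategy, path: &Path, size: usize) -> bool {
    match strategy {
        CompressionStrategy::None => false,
        CompressionStrategy::Fast => true,
        CompressionStrategy::Adaptive { min_size, skip_extensions } => {
            if size < *min_size {
                return false; // too small to be worth compressing
            }
            let ext = path.extension().and_then(|e| e.to_str()).unwrap_or("");
            !skip_extensions.iter().any(|s| s.eq_ignore_ascii_case(ext))
        }
        // Assumption: the predicate returns true when the file should be compressed.
        CompressionStrategy::Custom(pred) => (**pred)(path, size),
    }
}
```

Skipping already-compressed formats such as jpg or mp4 avoids spending CPU time on data that LZ4 is unlikely to shrink.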
Checkpoint represents a point-in-time snapshot.
pub struct Checkpoint {
    pub id: String,                   // UUID v4 identifier
    pub parent_id: Option<String>,    // Parent checkpoint for timeline
    pub timestamp: DateTime<Utc>,     // Creation timestamp
    pub description: Option<String>,  // User description
    pub metadata: CheckpointMetadata, // Size, file count, etc.
    pub state_hash: String,           // SHA-256 of checkpoint state
    pub content_merkle_root: String,  // Merkle tree root hash
}
DiffOptions configures line-level diff generation.
pub struct DiffOptions {
    pub context_lines: usize,    // Lines of context (default: 3)
    pub ignore_whitespace: bool, // Ignore whitespace changes
    pub show_line_numbers: bool, // Include line numbers
    pub max_file_size: u64,      // Max file size for diffs (default: 10MB)
}
FileDiff holds line-level diff information for a single file.
pub struct FileDiff {
    pub path: PathBuf,
    pub is_binary: bool,
    pub hunks: Vec<DiffHunk>, // Contiguous blocks of changes
    pub lines_added: usize,
    pub lines_deleted: usize,
}
Titor uses a comprehensive error type hierarchy:
#[derive(Debug, thiserror::Error)]
pub enum TitorError {
    #[error("IO error: {0}")]
    Io(#[from] std::io::Error),

    #[error("Checkpoint not found: {0}")]
    CheckpointNotFound(String),

    #[error("Storage corruption detected: {0}")]
    StorageCorruption(String),

    #[error("Checkpoint has children: {0}")]
    CheckpointHasChildren(String),

    // Additional error variants...
}
- Parallel File Scanning: Utilizes Rayon for concurrent directory traversal
- Streaming Compression: Processes large files in chunks to minimize memory usage
- Content Deduplication: Identical files stored only once across all checkpoints
- Lazy Loading: Objects loaded from storage only when needed
- Sharded Storage: Two-character prefix sharding prevents filesystem bottlenecks
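Content deduplication combined with reference counting can be sketched with a small in-memory store. This is an illustration only: it keys objects with the standard library's `DefaultHasher` as a stand-in for SHA-256 (it is not cryptographic and collisions are ignored), and `DedupStore` with its methods are hypothetical names.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Stand-in content key. Titor uses SHA-256; DefaultHasher is NOT cryptographic
/// and is used here only so the sketch needs nothing beyond std.
fn content_key(data: &[u8]) -> u64 {
    let mut h = DefaultHasher::new();
    data.hash(&mut h);
    h.finish()
}

/// Hypothetical in-memory object store with per-object reference counts.
#[derive(Default)]
pub struct DedupStore {
    objects: HashMap<u64, (Vec<u8>, usize)>, // key -> (content, refcount)
}

impl DedupStore {
    /// Store content, or only bump the refcount if identical content exists.
    pub fn put(&mut self, data: &[u8]) -> u64 {
        let key = content_key(data);
        let entry = self.objects.entry(key).or_insert_with(|| (data.to_vec(), 0));
        entry.1 += 1;
        key
    }

    /// Drop one reference; returns false if the key is unknown or already at zero.
    pub fn release(&mut self, key: u64) -> bool {
        match self.objects.get_mut(&key) {
            Some(entry) if entry.1 > 0 => {
                entry.1 -= 1;
                true
            }
            _ => false,
        }
    }

    /// Remove objects with zero references; returns how many were collected.
    pub fn gc(&mut self) -> usize {
        let before = self.objects.len();
        self.objects.retain(|_, (_, refs)| *refs > 0);
        before - self.objects.len()
    }

    /// Number of distinct objects currently stored.
    pub fn len(&self) -> usize {
        self.objects.len()
    }
}
```

Storing the same bytes twice yields one object with a count of two, so garbage collection only removes content once every checkpoint that referenced it has released it.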
Each stored object consists of:
- 4-byte header indicating compression status
- LZ4-compressed content (if compression applied)
- Original content (if compression not beneficial)
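The object layout described above can be sketched as an encode/decode pair. The text only says the header is 4 bytes, so the concrete flag values and the little-endian encoding below are assumptions, and the payload is treated as opaque bytes rather than actually running LZ4.

```rust
/// Assumed header values; the real on-disk constants are not given in the text.
const HEADER_RAW: u32 = 0;
const HEADER_LZ4: u32 = 1;

/// Prefix `payload` with a 4-byte little-endian flag (endianness is an assumption).
pub fn encode_object(is_compressed: bool, payload: &[u8]) -> Vec<u8> {
    let flag = if is_compressed { HEADER_LZ4 } else { HEADER_RAW };
    let mut out = Vec::with_capacity(4 + payload.len());
    out.extend_from_slice(&flag.to_le_bytes());
    out.extend_from_slice(payload);
    out
}

/// Split a stored object into (is_compressed, payload).
/// Returns None if the input is shorter than the header or the flag is unknown.
pub fn decode_object(bytes: &[u8]) -> Option<(bool, &[u8])> {
    if bytes.len() < 4 {
        return None;
    }
    let (head, body) = bytes.split_at(4);
    let flag = u32::from_le_bytes([head[0], head[1], head[2], head[3]]);
    match flag {
        HEADER_RAW => Some((false, body)),
        HEADER_LZ4 => Some((true, body)),
        _ => None,
    }
}
```

A reader would call `decode_object` first and only run decompression when the flag says the payload is compressed, which is what lets incompressible files be stored as-is.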
File manifests use Bincode serialization for efficiency:
pub struct FileManifest {
    pub checkpoint_id: String,
    pub files: Vec<FileEntry>,
    pub total_size: u64,
    pub file_count: usize,
    pub merkle_root: String,
    pub created_at: DateTime<Utc>,
}
Titor implements multiple layers of integrity checking:
- Content Hashing: SHA-256 hash for each file
- Merkle Trees: Cryptographic proof of entire checkpoint state
- State Hashing: Combined hash of all checkpoint components
- Tamper Detection: Automatic corruption detection during verification
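To show how a Merkle root summarizes a set of files, here is a minimal sketch. It uses the standard library's `DefaultHasher` as a stand-in for SHA-256, so it demonstrates the tree construction only and provides no cryptographic guarantees; the odd-node rule (pairing a lone node with itself) is one common convention and may differ from Titor's.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Stand-in for SHA-256. NOT cryptographic; used so the sketch needs only std.
fn h(data: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    data.hash(&mut hasher);
    hasher.finish()
}

/// Hash two child digests into a parent digest.
fn hash_pair(left: u64, right: u64) -> u64 {
    let mut buf = [0u8; 16];
    buf[..8].copy_from_slice(&left.to_be_bytes());
    buf[8..].copy_from_slice(&right.to_be_bytes());
    h(&buf)
}

/// Merkle root over leaf contents in the given order; None when there are no leaves.
/// An odd node at any level is paired with itself (assumed convention).
pub fn merkle_root(leaves: &[&[u8]]) -> Option<u64> {
    if leaves.is_empty() {
        return None;
    }
    let mut level: Vec<u64> = leaves.iter().map(|leaf| h(leaf)).collect();
    while level.len() > 1 {
        level = level
            .chunks(2)
            .map(|pair| hash_pair(pair[0], *pair.last().unwrap()))
            .collect();
    }
    Some(level[0])
}
```

Because every parent depends on its children, changing any single leaf changes the root, which is what allows one stored root hash to vouch for an entire checkpoint.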
// `storage` and `checkpoint` are assumed to be in scope
let verifier = CheckpointVerifier::new(&storage);
let report = verifier.verify_complete(&checkpoint)?;
if !report.is_valid() {
    eprintln!("Verification failed: {}", report.summary());
    for error in &report.errors {
        eprintln!(" - {}", error);
    }
}
Titor includes a comprehensive command-line interface.
# Install the titor CLI from crates.io
cargo install titor
# Or install from local path
cargo install --path .
# Initialize repository
titor init --compression adaptive
# Create checkpoint
titor checkpoint -m "Before refactoring"
# List checkpoints
titor list --detailed
# Restore to checkpoint
titor restore <checkpoint-id>
# Show timeline tree
titor timeline
# Compare checkpoints
titor diff <from-id> <to-id>
# Compare with line-level differences (git-like)
titor diff <from-id> <to-id> --lines
# Compare with custom context lines
titor diff <from-id> <to-id> --lines --context 5
# Show only statistics
titor diff <from-id> <to-id> --stat
# Ignore whitespace changes
titor diff <from-id> <to-id> --lines --ignore-whitespace
# Verify integrity
titor verify --all
# Garbage collection
titor gc --dry-run
Create independent branches for experimentation:
// Fork from existing checkpoint
let fork = titor.fork(&checkpoint_id, Some("Experimental branch".to_string()))?;
// Work on branch...
// Later, restore to original
titor.restore(&checkpoint_id)?;
Configure automatic checkpoint creation:
titor.set_auto_checkpoint(AutoCheckpointStrategy::Smart {
    min_files_changed: 10,
    min_size_changed: 1024 * 1024,               // 1MB
    max_time_between: Duration::from_secs(3600), // 1 hour
});
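The text does not spell out how the three `Smart` thresholds combine, so the following is only one plausible reading: checkpoint when either change threshold is reached, or when the time limit has passed and something has changed. `SmartThresholds` and `should_auto_checkpoint` are hypothetical names that mirror the fields above.

```rust
use std::time::Duration;

/// Local mirror of the `Smart` thresholds shown above (not Titor's own type).
pub struct SmartThresholds {
    pub min_files_changed: usize,
    pub min_size_changed: u64,
    pub max_time_between: Duration,
}

/// Assumed rule, not confirmed by the source: never checkpoint with zero
/// changed files; otherwise checkpoint when any one threshold is reached.
pub fn should_auto_checkpoint(
    t: &SmartThresholds,
    files_changed: usize,
    bytes_changed: u64,
    since_last: Duration,
) -> bool {
    if files_changed == 0 {
        return false;
    }
    files_changed >= t.min_files_changed
        || bytes_changed >= t.min_size_changed
        || since_last >= t.max_time_between
}
```

Under this reading, the time limit acts as a backstop so that small, slow-moving changes are still captured eventually.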
Implement custom logic for checkpoint events:
// Hook type for this example; any type can implement CheckpointHook
struct MyHook;

impl CheckpointHook for MyHook {
    fn pre_checkpoint(&self, stats: &ChangeStats) -> Result<()> {
        println!("Creating checkpoint with {} changes", stats.total_operations());
        Ok(())
    }

    fn post_checkpoint(&self, checkpoint: &Checkpoint) -> Result<()> {
        // Custom post-checkpoint logic
        Ok(())
    }
}

titor.add_hook(Box::new(MyHook));
git clone https://github.com/mufeedvh/titor.git
cd titor
cargo build --release
# Unit and integration tests
cargo test
# Include ignored tests
cargo test -- --ignored
# Run with logging
RUST_LOG=debug cargo test
titor/
├── src/
│   ├── lib.rs           # Public API
│   ├── titor.rs         # Core implementation
│   ├── storage.rs       # Storage backend
│   ├── checkpoint.rs    # Checkpoint types
│   ├── timeline.rs      # Timeline management
│   ├── compression.rs   # Compression engine
│   ├── verification.rs  # Integrity checking
│   └── merkle.rs        # Merkle tree implementation
├── benches/             # Performance benchmarks
├── examples/            # Example implementations
└── tests/               # Integration tests
Licensed under the MIT License. See LICENSE for more information.