FlorDB: Log-First Context Management for ML Practitioners

FlorDB brings experiment tracking, provenance, and reproducibility to your ML workflow—using the one thing every engineer already writes: logs.

Unlike heavyweight MLOps platforms, FlorDB doesn’t ask you to adopt a new UI, schema, or service. Just import it, log as you normally would, and gain full history, lineage, and replay capabilities across your training runs.

🚀 Why FlorDB?

  • Log-Driven Experiment Tracking
    No dashboards to configure or schemas to design. Swap your existing print() or logging calls for flor.log() and FlorDB turns them into structured, queryable metadata.

  • Hindsight Logging & Replay
    Missed a metric? Add a log after the fact and replay past runs to capture it—no rerunning from scratch.

  • Reproducibility Without Friction
    Every run is versioned via Git, every hyperparameter is recorded, and every model checkpoint is linked and queryable—automatically.

  • Works With Your Stack
    Makefiles, Airflow, Slurm, HuggingFace, PyTorch—you don’t change your workflow. FlorDB fits in.

📦 Installation

pip install flordb

For contributors or bleeding-edge features:

git clone https://github.com/ucbrise/flor.git
cd flor
pip install -e .

📝 First Log in 30 Seconds

Requires a Git repository for automatic versioning.

mkdir flor_sandbox
cd flor_sandbox
git init
ipython

import flordb as flor
flor.log("message", "Hello ML World!")

message: Hello ML World!
Changes committed successfully

Retrieve logs anytime:

flor.dataframe("message")
         projid              tstamp filename          message
0  flor_sandbox 2025-10-13 18:13:48  ipython  Hello ML World!

🧪 Track Experiments with Zero Overhead

Drop FlorDB into your existing training script:

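import flordb as flor

# Hyperparameters (each can be overridden from the CLI)
lr = flor.arg("lr", 1e-3)
batch_size = flor.arg("batch_size", 32)

# net, optimizer, trainloader, and epochs come from your existing training code
with flor.checkpointing(model=net, optimizer=optimizer):
    for epoch in flor.loop("epoch", range(epochs)):
        for x, y in flor.loop("step", trainloader):
            ...  # forward pass, loss computation, backward pass, optimizer step
            flor.log("loss", loss.item())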

Change hyperparameters from the CLI:

python train.py --kwargs lr=5e-4 batch_size=64
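As a rough mental model of how name=value overrides can reach the defaults declared in the script, consider the sketch below. The `parse_kwargs` and `arg` helpers are hypothetical stand-ins written for this example, not FlorDB's code; the assumption is that an override is cast to the type of the declared default.

```python
def parse_kwargs(argv):
    """Collect name=value pairs that follow a --kwargs flag (hypothetical helper)."""
    if "--kwargs" not in argv:
        return {}
    start = argv.index("--kwargs") + 1
    overrides = {}
    for token in argv[start:]:
        if token.startswith("--") or "=" not in token:
            break  # stop at the next flag or at anything that is not name=value
        name, _, value = token.partition("=")
        overrides[name] = value
    return overrides

def arg(name, default, overrides):
    """Return the override cast to the default's type, or the default itself."""
    if name not in overrides:
        return default
    return type(default)(overrides[name])

# Same arguments as the command line shown above.
overrides = parse_kwargs(["train.py", "--kwargs", "lr=5e-4", "batch_size=64"])
lr = arg("lr", 1e-3, overrides)
batch_size = arg("batch_size", 32, overrides)
print(lr, batch_size)  # prints: 0.0005 64
```

Casting through the default's type keeps `lr` a float and `batch_size` an int, so the rest of the script does not need to know whether a value came from the command line.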

View metrics across runs:

flor.dataframe("lr", "batch_size", "loss")
        projid              tstamp  filename  epoch  step      lr batch_size                 loss
0  ml_tutorial 2025-10-13 18:18:14  train.py      1   500  0.0005         64  0.20570574700832367
1  ml_tutorial 2025-10-13 18:18:14  train.py      2   500  0.0005         64   0.1964433193206787
2  ml_tutorial 2025-10-13 18:18:14  train.py      3   500  0.0005         64  0.11040152609348297
3  ml_tutorial 2025-10-13 18:18:14  train.py      4   500  0.0005         64    0.155434250831604
4  ml_tutorial 2025-10-13 18:18:14  train.py      5   500  0.0005         64   0.0741351768374443
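Once the metrics are in tabular form, ordinary Python is enough to ask questions of them. The sketch below copies the epoch and loss values from the table above into plain dictionaries so that it runs without FlorDB. With a pandas DataFrame, `df.to_dict("records")` would give rows of this shape; that `flor.dataframe` returns a pandas DataFrame is an assumption here.

```python
# Rows copied from the table above; unrelated columns are omitted for brevity.
rows = [
    {"epoch": 1, "lr": 0.0005, "batch_size": 64, "loss": 0.20570574700832367},
    {"epoch": 2, "lr": 0.0005, "batch_size": 64, "loss": 0.1964433193206787},
    {"epoch": 3, "lr": 0.0005, "batch_size": 64, "loss": 0.11040152609348297},
    {"epoch": 4, "lr": 0.0005, "batch_size": 64, "loss": 0.155434250831604},
    {"epoch": 5, "lr": 0.0005, "batch_size": 64, "loss": 0.0741351768374443},
]

# Epoch with the lowest recorded loss.
best = min(rows, key=lambda r: r["loss"])

# Epochs where the loss went up compared with the previous epoch.
regressions = [
    cur["epoch"] for prev, cur in zip(rows, rows[1:]) if cur["loss"] > prev["loss"]
]

print("best epoch:", best["epoch"], "loss:", best["loss"])
print("epochs with a loss increase:", regressions)
```

For this run the lowest loss is at epoch 5, and epoch 4 is the only one where the loss rose.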

🔍 Hindsight Logging: Fix It After You See It

Forgot to log gradient norms?

flor.log("grad_norm", ...)

Just add the logging statement to the script and run:

python -m flordb replay grad_norm

FlorDB replays only what’s needed: it injects the new logging statement into copies of the historical versions of your script, re-executes them, and commits the results.
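The idea behind hindsight logging can be illustrated with a toy example: if a run saved a checkpoint per epoch, a metric that was never logged can be computed later from those checkpoints without retraining. This is a sketch of the concept only, not FlorDB's replay mechanism; `train_one_epoch`, `max_abs_weight`, and the in-memory `checkpoints` dictionary are hypothetical.

```python
def train_one_epoch(weights, lr):
    """Toy update: move every weight a fraction `lr` of the way toward zero."""
    return [w - lr * w for w in weights]

# Original run: a checkpoint is kept per epoch, but no weight metric is logged.
weights = [4.0, -2.0]
checkpoints = {}
for epoch in range(1, 4):
    weights = train_one_epoch(weights, lr=0.5)
    checkpoints[epoch] = list(weights)  # copy so later updates do not alias it

def max_abs_weight(state):
    """The metric we wish we had logged during the original run."""
    return max(abs(w) for w in state)

# Hindsight: visit each saved state and compute the new metric, with no retraining.
replayed = {epoch: max_abs_weight(state) for epoch, state in checkpoints.items()}
print(replayed)  # prints: {1: 2.0, 2: 1.0, 3: 0.5}
```

The cost of recovering the metric is one pass over the saved states rather than a second training run.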

🏗 Real ML Systems Built on FlorDB

FlorDB powers full AI/ML lifecycle tooling:

  • Feature Stores & Model Registries
  • Document Parsing & Feedback Loops
  • Continuous Training Pipelines

See our Scan Studio and Document Parser examples for real-world integration.

📚 Publications

FlorDB is based on research from UC Berkeley’s RISE Lab and Arizona State University.

  • Flow with FlorDB: Incremental Context Maintenance for the Machine Learning Lifecycle (CIDR 2025)
  • The Management of Context in the ML Lifecycle (UCB Tech Report 2024)
  • Hindsight Logging for Model Training (PVLDB 2021)

🛠 License

Apache License 2.0: free to use, modify, and distribute.

💡 Get Involved

FlorDB is actively developed. Contributions, issues, and real-world use cases are welcome!

GitHub: https://github.com/ucbrise/flor
Tutorial Video: https://youtu.be/mKENSkk3S4Y