FlorDB brings experiment tracking, provenance, and reproducibility to your ML workflow—using the one thing every engineer already writes: logs.
Unlike heavyweight MLOps platforms, FlorDB doesn’t ask you to adopt a new UI, schema, or service. Just import it, log as you normally would, and gain full history, lineage, and replay capabilities across your training runs.
## Features

- **Log-Driven Experiment Tracking.** No dashboards to configure or schemas to design. FlorDB turns your existing `print()` or `log()` calls into structured, queryable metadata.
- **Hindsight Logging & Replay.** Missed a metric? Add a log statement after the fact and replay past runs to capture it, with no rerunning from scratch.
- **Reproducibility Without Friction.** Every run is versioned via Git, every hyperparameter is recorded, and every model checkpoint is linked and queryable, automatically.
- **Works With Your Stack.** Makefiles, Airflow, Slurm, HuggingFace, PyTorch: you don't change your workflow, FlorDB fits in.
## Installation

Install from PyPI:

```shell
pip install flordb
```

For contributors or bleeding-edge features, install from source:

```shell
git clone https://github.com/ucbrise/flor.git
cd flor
pip install -e .
```

FlorDB requires a Git repository for automatic versioning.
## Getting Started

Create a sandbox repository and start an interactive session:

```shell
mkdir flor_sandbox
cd flor_sandbox
git init
ipython
```

Log your first message:

```python
import flordb as flor

flor.log("message", "Hello ML World!")
```

```
message: Hello ML World!
Changes committed successfully
```

Retrieve logs anytime:

```python
flor.dataframe("message")
```

```
         projid               tstamp filename          message
0  flor_sandbox  2025-10-13 18:13:48  ipython  Hello ML World!
```
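The tabular output above suggests that `flor.dataframe` returns a standard pandas `DataFrame` (an assumption based on the display format), so ordinary pandas operations apply to the result. A minimal sketch, using a hand-built frame shaped like the output above in place of a live FlorDB session:

```python
import pandas as pd

# Hypothetical stand-in for flor.dataframe("message"): a plain pandas
# DataFrame shaped like the output shown above.
df = pd.DataFrame({
    "projid": ["flor_sandbox"],
    "tstamp": ["2025-10-13 18:13:48"],
    "filename": ["ipython"],
    "message": ["Hello ML World!"],
})

# Ordinary pandas filtering works on the result.
hello = df[df["message"].str.contains("Hello")]
print(hello["message"].iloc[0])  # prints "Hello ML World!"
```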
## Experiment Tracking

Drop FlorDB into your existing training script:

```python
import flordb as flor

# Hyperparameters
lr = flor.arg("lr", 1e-3)
batch_size = flor.arg("batch_size", 32)

with flor.checkpointing(model=net, optimizer=optimizer):
    for epoch in flor.loop("epoch", range(epochs)):
        for x, y in flor.loop("step", trainloader):
            ...
            flor.log("loss", loss.item())
```

Change hyperparameters from the CLI:

```shell
python train.py --kwargs lr=5e-4 batch_size=64
```

View metrics across runs:
```python
flor.dataframe("lr", "batch_size", "loss")
```

```
        projid               tstamp  filename  epoch  step      lr  batch_size                 loss
0  ml_tutorial  2025-10-13 18:18:14  train.py      1   500  0.0005          64  0.20570574700832367
1  ml_tutorial  2025-10-13 18:18:14  train.py      2   500  0.0005          64   0.1964433193206787
2  ml_tutorial  2025-10-13 18:18:14  train.py      3   500  0.0005          64  0.11040152609348297
3  ml_tutorial  2025-10-13 18:18:14  train.py      4   500  0.0005          64    0.155434250831604
4  ml_tutorial  2025-10-13 18:18:14  train.py      5   500  0.0005          64   0.0741351768374443
```
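Because the result appears to be a plain pandas `DataFrame` (again, an assumption based on the output format), you can analyze runs with standard pandas. A sketch that finds the best epoch, using a hand-built frame mirroring the rows above:

```python
import pandas as pd

# Hand-built stand-in for flor.dataframe("lr", "batch_size", "loss"),
# mirroring the rows shown above.
df = pd.DataFrame({
    "epoch": [1, 2, 3, 4, 5],
    "lr": [0.0005] * 5,
    "batch_size": [64] * 5,
    "loss": [0.205706, 0.196443, 0.110402, 0.155434, 0.074135],
})

# Row with the lowest recorded loss.
best = df.loc[df["loss"].idxmin()]
print(int(best["epoch"]), best["loss"])  # → 5 0.074135
```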
## Hindsight Logging & Replay

Forgot to log gradient norms? Just add the logging statement to the script:

```python
flor.log("grad_norm", ...)
```

and replay:

```shell
python -m flordb replay grad_norm
```

FlorDB replays only what's needed, injecting the new log statement into copies of historical versions and committing the results.
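The key idea behind efficient replay is that checkpointed state from the original run lets a newly requested metric be computed without retraining. The toy sketch below is not FlorDB's internal mechanism; it only illustrates that idea with hypothetical per-epoch checkpoints and a weight-norm metric:

```python
import math

# Toy per-epoch checkpoints, as if saved during the original run.
# (Hypothetical data; FlorDB manages real checkpoints via Git.)
checkpoints = {
    1: {"weights": [0.5, -0.3, 0.8]},
    2: {"weights": [0.4, -0.2, 0.6]},
}

def replay_metric(checkpoints, metric_fn):
    """Compute a newly requested metric from saved state, epoch by
    epoch, instead of re-running training from scratch."""
    return {epoch: metric_fn(state) for epoch, state in checkpoints.items()}

# A metric we "forgot" to log the first time: the L2 norm of the weights.
weight_norm = lambda s: math.sqrt(sum(w * w for w in s["weights"]))
print(replay_metric(checkpoints, weight_norm))
```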
## Use Cases

FlorDB powers full AI/ML lifecycle tooling:
- Feature Stores & Model Registries
- Document Parsing & Feedback Loops
- Continuous Training Pipelines
See our Scan Studio and Document Parser examples for real-world integration.
## Research

FlorDB is based on research from UC Berkeley's RISE Lab and Arizona State University.
- Flow with FlorDB: Incremental Context Maintenance for the Machine Learning Lifecycle (CIDR 2025)
- The Management of Context in the ML Lifecycle (UCB Tech Report 2024)
- Hindsight Logging for Model Training (PVLDB 2021)
## License

Apache 2.0 License: free to use, modify, and distribute.
## Contributing

FlorDB is actively developed. Contributions, issues, and real-world use cases are welcome!

- GitHub: https://github.com/ucbrise/flor
- Tutorial Video: https://youtu.be/mKENSkk3S4Y