Online (incremental) covariance and correlation estimation: the streaming complement to
`sklearn.covariance`, whose estimators are batch-only and have no `partial_fit`.
Pure Python + numpy; no other required dependencies. Because no estimator wins everywhere,
precise also assesses an estimate and recommends one for your data (see Assess & recommend).
📖 Docs: precise.microprediction.org
```shell
pip install precise
```

```python
from precise import EwaCovariance

est = EwaCovariance(r=0.05)
for y in stream:        # y is a 1d observation; pass a 2d array for a batch
    est.partial_fit(y)

est.covariance_         # (n, n) ndarray
est.correlation_        # unit-diagonal correlation
est.precision_          # inverse covariance
est.location_           # running mean

est.fit(X)              # sklearn-style batch drop-in (X is 2d)
```

Every estimator is truly online: a constant amount of work per observation, no growing
buffers. State is a plain dict, so you can checkpoint mid-stream with `get_state()` / `set_state()`.
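To make "constant work per observation" and "state is a plain dict" concrete, here is a minimal sketch of a Welford-style running covariance in plain Python. It is an illustration only, not precise's implementation: `new_state`, `update`, and `covariance` are hypothetical helper names, and the dict layout is an assumption.

```python
import copy

def new_state(d):
    """Fresh state for d-dimensional observations (a plain dict of lists)."""
    return {"n": 0, "mean": [0.0] * d, "m2": [[0.0] * d for _ in range(d)]}

def update(state, y):
    """Fold one observation into the state in O(d^2) time, keeping no history."""
    state["n"] += 1
    n = state["n"]
    delta_old = [yi - mi for yi, mi in zip(y, state["mean"])]  # deviation from old mean
    state["mean"] = [mi + di / n for mi, di in zip(state["mean"], delta_old)]
    delta_new = [yi - mi for yi, mi in zip(y, state["mean"])]  # deviation from new mean
    for i, di in enumerate(delta_old):
        for j, dj in enumerate(delta_new):
            state["m2"][i][j] += di * dj                       # running co-moment
    return state

def covariance(state):
    """Unbiased sample covariance implied by the state (needs n >= 2)."""
    n = state["n"]
    return [[v / (n - 1) for v in row] for row in state["m2"]]

data = [[1.0, 2.0], [2.0, 1.0], [4.0, 3.0], [3.0, 6.0]]

state = new_state(2)
for y in data[:2]:
    update(state, y)
checkpoint = copy.deepcopy(state)     # snapshot mid-stream: just a dict copy

for y in data[2:]:
    update(state, y)

resumed = copy.deepcopy(checkpoint)   # resume from the snapshot
for y in data[2:]:
    update(resumed, y)

print(covariance(state))
print(covariance(resumed) == covariance(state))
```

Each update touches only the d-by-d accumulator, so the cost does not depend on how many observations came before, and copying the dict is enough to resume later. With precise itself the corresponding calls are `get_state()` and `set_state()`.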
| Class | What it does |
|---|---|
| `EmpiricalCovariance` | running sample covariance (Welford) |
| `EwaCovariance` | exponentially weighted (recency-biased) |
| `AdaptiveEwaCovariance` | EWMA whose forgetting rate speeds up on regime change |
| `LedoitWolfCovariance` | online Ledoit-Wolf shrinkage towards a scaled identity |
| `OASCovariance` | online Oracle Approximating Shrinkage (often better-conditioned than LW) |
| `ShrunkCovariance` | fixed-intensity shrinkage to identity or a constant-correlation target |
| `PartialMomentsCovariance` | exponentially weighted partial-moment (semi-)covariance |
| `HuberCovariance` | online robust estimator that downweights outliers |
| `TylerCovariance` | recursive Tyler M-estimator; robust correlation/shape for elliptical data |
| `GeodesicEwaCovariance` | recency-weighted update along the affine-invariant SPD geodesic |
| `DCCCovariance` | dynamic conditional correlation; decouples volatility from correlation |
| `FactorCovariance` | online low-rank + diagonal (approximate factor model); O(d·k) per step |
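Several rows in the table are shrinkage estimators. As a rough sketch of the idea, the snippet below blends a sample covariance with a scaled identity at a fixed intensity. The function name and the `alpha` parameter are hypothetical and this is not precise's code; Ledoit-Wolf and OAS differ mainly in that they choose the intensity from the data rather than taking it as given.

```python
def shrink_to_scaled_identity(cov, alpha):
    """Return (1 - alpha) * cov + alpha * mu * I, where mu = trace(cov) / d."""
    d = len(cov)
    mu = sum(cov[i][i] for i in range(d)) / d   # average variance sets the target scale
    return [
        [(1.0 - alpha) * cov[i][j] + (alpha * mu if i == j else 0.0) for j in range(d)]
        for i in range(d)
    ]

sample = [[4.0, 1.8], [1.8, 1.0]]               # strongly correlated, poorly conditioned
shrunk = shrink_to_scaled_identity(sample, alpha=0.5)
print(shrunk)
```

The off-diagonal entries are pulled towards zero and the variances towards their average, while the trace is unchanged, which is why shrinkage tends to improve conditioning.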
```python
from precise import all_estimators, estimator_from_name

all_estimators()                             # the list of classes (a bake-off in one loop)
estimator_from_name("LedoitWolfCovariance")
```

In streaming/finance settings observations arrive as dicts keyed by name, and the set of names
can change over time. `keyed(...)` decorates any of the estimators above to consume keyed dicts
(river-style `update` / `learn_one`) and emit keyed output:
```python
from precise import keyed, EwaCovariance

d = keyed(EwaCovariance(r=0.05), dynamic=True)   # changing universe (DynamicUniverse)
d.update({"AAPL": 0.01, "MSFT": -0.02})
d.update({"MSFT": 0.00, "NVDA": 0.03})           # AAPL leaves, NVDA enters
d.covariance_                                    # dict-of-dicts over the live universe
d.to_frame()                                     # pandas DataFrame (pip install precise[pandas])

k = keyed(EwaCovariance(r=0.05))                 # fixed universe, imputes missing keys (FixedUniverse)
```

`dynamic=False` (the default) gives a `FixedUniverse` (one wrapped estimator, missing keys imputed);
`dynamic=True` gives a `DynamicUniverse` (a wrapped estimator per live key-set). Both work with any
positional estimator; the adapter adds no covariance math of its own.
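The bookkeeping an adapter like this has to do can be sketched in a few lines: align each keyed observation to a fixed key order, and label a positional matrix as a dict-of-dicts on the way out. This is an illustration under stated assumptions, not the adapter's code. `to_vector` and `to_keyed` are hypothetical names, and the imputation rule used here (carry the last seen value forward, else 0.0) is an assumption, since the text above does not say how missing keys are imputed.

```python
def to_vector(obs, keys, last_seen, default=0.0):
    """Align a keyed observation to a fixed key order, imputing missing keys.

    Assumed rule: reuse the last value seen for a key, or `default` if none.
    """
    vec = []
    for key in keys:
        if key in obs:
            last_seen[key] = obs[key]          # remember the freshest value
        vec.append(last_seen.get(key, default))
    return vec

def to_keyed(matrix, keys):
    """Label a square positional matrix as a dict-of-dicts keyed by name."""
    return {
        a: {b: matrix[i][j] for j, b in enumerate(keys)}
        for i, a in enumerate(keys)
    }

keys = ["AAPL", "MSFT", "NVDA"]
last_seen = {}
v1 = to_vector({"AAPL": 0.01, "MSFT": -0.02}, keys, last_seen)
v2 = to_vector({"MSFT": 0.00, "NVDA": 0.03}, keys, last_seen)

# A made-up positional covariance, only to show the labelling step
labelled = to_keyed([[1.0, 0.2, 0.1], [0.2, 1.0, 0.3], [0.1, 0.3, 1.0]], keys)
print(v1, v2, labelled["MSFT"]["NVDA"])
```

The positional vectors are what a wrapped estimator would see; everything involving names stays outside it, which is consistent with the adapter adding no covariance math of its own.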
`H = D R D` is a composition, not a fixed algorithm: `D` is the diagonal matrix of per-series
volatilities and `R` is the correlation matrix. `ConditionalCovariance` lets you pick the
per-series volatility model and the correlation estimator independently; `DCCCovariance` is
just the EWMA/EWMA special case:
```python
from precise import ConditionalCovariance, EwaCovariance, LedoitWolfCovariance

est = ConditionalCovariance(
    vol=EwaCovariance(r=0.02),           # any estimator, used per series in 1-D
    corr=LedoitWolfCovariance(r=0.05),   # correlation from any estimator
)
```

The volatility model can also be any univariate model from
microprediction/skaters (Holt, Hosking, …) via
`from_skater`. precise doesn't depend on it; the adapter is duck-typed:
```python
import skaters
from precise import ConditionalCovariance, EwaCovariance, from_skater

est = ConditionalCovariance(vol=from_skater(skaters.holt), corr=EwaCovariance(r=0.05))
```

No estimator wins everywhere, so precise treats judging and choosing an estimate as
first-class alongside producing one.
```python
from precise import all_assessors, suggest

all_assessors()     # scoring rules: LogLikelihood, BlockPseudoLikelihood, SchurLikelihood,
                    # SteinLoss, FrobeniusToTruth, GMVVariance, ... (higher = better)
suggest(X, top=3)   # recommend estimator classes from observable features of X
```

`suggest` maps truth-free features of your data (p/n, effective rank, sphericity, condition number,
off-diagonal mass, excess kurtosis) to an estimator, via a frozen, numpy-only decision tree. The
Schur pseudo-likelihood, a one-parameter (γ) bridge between the full and block-diagonal
Gaussian likelihoods, is both an assessor here and the subject of a
working paper.
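To show what a "higher = better" scoring rule looks like, here is a minimal sketch that scores a covariance estimate by the average Gaussian log-likelihood it assigns to held-out observations. It is written in plain Python for self-containment and is not precise's `LogLikelihood` assessor; `cholesky`, `gaussian_loglik`, and `average_score` are hypothetical names, and the data are made up.

```python
import math

def cholesky(a):
    """Lower-triangular L with L L^T = a, for a symmetric positive-definite a."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low

def gaussian_loglik(y, mean, cov):
    """Log-density of y under N(mean, cov)."""
    d = len(y)
    low = cholesky(cov)
    z = []                                    # forward-solve L z = y - mean
    for i in range(d):
        s = sum(low[i][k] * z[k] for k in range(i))
        z.append((y[i] - mean[i] - s) / low[i][i])
    logdet = 2.0 * sum(math.log(low[i][i]) for i in range(d))
    quad = sum(v * v for v in z)              # (y - mean)^T cov^{-1} (y - mean)
    return -0.5 * (d * math.log(2.0 * math.pi) + logdet + quad)

def average_score(observations, mean, cov):
    """Mean log-likelihood over held-out observations (higher = better)."""
    return sum(gaussian_loglik(y, mean, cov) for y in observations) / len(observations)

held_out = [[1.0, 0.9], [-1.0, -1.1], [0.5, 0.4], [-0.6, -0.5]]
mean = [0.0, 0.0]
score_correlated = average_score(held_out, mean, [[1.0, 0.9], [0.9, 1.0]])
score_identity = average_score(held_out, mean, [[1.0, 0.0], [0.0, 1.0]])
print(score_correlated > score_identity)
```

The held-out points move together, so the estimate that encodes a strong positive correlation scores higher than the identity. A score like this needs no knowledge of the true covariance, unlike a distance-to-truth measure.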
- Generating random covariance/correlation matrices to test against: randomcov.
- Portfolio construction (Schur-complementary allocation, HRP) moved to allocation.microprediction.org; for production use the skfolio implementation is recommended.
- A Robust Portfolio Literature Reading List lives in this repo.
- Part of the microprediction project.
Migrating from precise < 1.0 (the functional "skater" API)? See MIGRATING.md.
Not investment advice. Just code, subject to the MIT License.