AAGI-AUS/kernP

kernP

Lifecycle: experimental | Python 3.10+

Kernel-Based Causal Distributional Testing for Python

kernP implements kernel-based hypothesis tests for causal inference and distributional treatment effects. It targets effects that mean-focused methods (t-tests, linear regression, Double ML) can miss, including non-linear causal associations, variance shifts, and distributional changes that leave the mean unchanged.

Pure NumPy/SciPy implementation — no compiled extensions required.

Key Features

  • HSIC independence test — detects non-linear dependencies that Pearson correlation misses
  • MMD two-sample test — detects any distributional difference, not just mean shifts
  • Backdoor-adjusted HSIC (bd-HSIC) — tests causal association after adjusting for confounders (Hu, Sejdinovic & Evans, 2024)
  • Doubly robust distributional tests (DR-DATE, DR-DETT) — tests for distributional treatment effects with double robustness (Fawkes, Hu, Evans & Sejdinovic, 2024)
  • Hierarchical/nested data — within-cluster and between-cluster decomposition for clustered data
  • DataFrame interface: kernel_causal_test(df, y="y", treatment="t", covariates=["x1", "x2"])
  • scikit-learn integration — any classifier with predict_proba() for propensity estimation
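To make the HSIC bullet concrete, here is a minimal NumPy sketch of a permutation-based HSIC independence test. It is an illustration of the general technique, not kernP's implementation: the function names `rbf_gram` and `hsic_permutation_pvalue`, the Gaussian kernel, the median-distance bandwidth, and the biased estimator are all assumptions made for this example.

```python
import numpy as np


def rbf_gram(v):
    """Gaussian (RBF) Gram matrix for a 1-D or 2-D sample.

    The bandwidth is set by a median heuristic (sigma^2 = median squared
    pairwise distance); this is one common variant, chosen here for illustration.
    """
    v = np.asarray(v, dtype=float)
    v = v.reshape(len(v), -1)
    sq_norms = np.sum(v**2, axis=1)
    d2 = np.maximum(sq_norms[:, None] + sq_norms[None, :] - 2.0 * (v @ v.T), 0.0)
    sigma2 = np.median(d2[np.triu_indices(len(v), k=1)])
    return np.exp(-d2 / (2.0 * sigma2))


def hsic_permutation_pvalue(x, y, n_perm=500, seed=0):
    """Permutation p-value for the biased HSIC statistic trace(K H L H) / n^2."""
    rng = np.random.default_rng(seed)
    K, L = rbf_gram(x), rbf_gram(y)
    n = K.shape[0]
    H = np.eye(n) - 1.0 / n          # centering matrix I - (1/n) 11^T
    Kc = H @ K @ H                   # centre K once; L is permuted below

    def stat(L_mat):
        # trace(K H L H) equals sum(Kc * L) because L is symmetric.
        return np.sum(Kc * L_mat) / n**2

    observed = stat(L)
    null = np.empty(n_perm)
    for b in range(n_perm):
        idx = rng.permutation(n)     # shuffling y breaks any x-y dependence
        null[b] = stat(L[np.ix_(idx, idx)])
    # Add-one correction keeps the p-value strictly positive.
    return float((1 + np.sum(null >= observed)) / (1 + n_perm))
```

Permuting only the rows and columns of one Gram matrix simulates the null of independence without recomputing any kernels, which is why the loop stays cheap.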

Installation

pip install kernP

# With optional extras
pip install "kernP[plot]"      # matplotlib for result.plot()
pip install "kernP[xgboost]"   # XGBoost propensity models
pip install "kernP[dev]"       # pytest, ruff, mypy

Quick Start

import numpy as np
import kernp

# Detect non-linear dependence (HSIC)
rng = np.random.default_rng(42)
x = rng.normal(size=200)
y = x**2 + rng.normal(size=200, scale=0.3)  # quadratic relationship

from scipy.stats import pearsonr
pearsonr(x, y).pvalue       # Pearson: p = 0.85 (misses it!)
kernp.hsic_test(x, y, seed=1).p_value  # HSIC: p = 0.002 (detects it)

# Test for distributional treatment effect (DR-DATE)
n = 300
covariates = rng.normal(size=(n, 2))
treatment = rng.binomial(1, 0.5, size=n)
outcome = np.where(treatment, rng.normal(0, 2, n), rng.normal(0, 1, n)) + covariates[:, 0]

from scipy.stats import ttest_ind
ttest_ind(outcome[treatment == 1], outcome[treatment == 0]).pvalue  # p = 0.72
kernp.dr_date_test(outcome, treatment, covariates, seed=1).p_value  # p = 0.002

When to Use kernP

| Your question | Traditional method | kernP function | Why kernP wins |
| --- | --- | --- | --- |
| Are X and Y related? (possibly non-linearly) | pearsonr() | hsic_test() | Detects U-shapes, periodicity, any non-linear pattern |
| Do two groups differ? (beyond means) | ttest_ind() | mmd_test() | Detects variance, shape, and tail differences |
| Does X causally affect Y? (with confounders) | Double ML | bd_hsic_test() | Non-parametric; no functional form assumptions |
| Does treatment change the outcome distribution? | t-test / TMLE | dr_date_test() | Full distributional comparison; doubly robust |
| Effect on treated subgroup only? | | dr_dett_test() | One-sided overlap; robust in imperfect settings |
| Clustered/hierarchical data? | Mixed models | hierarchical_test() | Non-parametric + proper permutation within clusters |
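The "Do two groups differ?" row can be illustrated with a small NumPy sketch of an MMD two-sample permutation test for 1-D samples. This is a generic version of the technique and not kernP's code: the names `mmd2_biased` and `mmd_permutation_pvalue`, the Gaussian kernel, the median-distance bandwidth, and the biased estimator are assumptions for the example.

```python
import numpy as np


def mmd2_biased(K, is_a):
    """Biased squared MMD from a pooled Gram matrix and a boolean group mask."""
    a, b = is_a, ~is_a
    return (
        K[np.ix_(a, a)].mean()
        + K[np.ix_(b, b)].mean()
        - 2.0 * K[np.ix_(a, b)].mean()
    )


def mmd_permutation_pvalue(sample_a, sample_b, n_perm=500, seed=0):
    """Permutation p-value for equality of two 1-D distributions."""
    rng = np.random.default_rng(seed)
    pooled = np.concatenate([sample_a, sample_b]).astype(float).reshape(-1, 1)
    n = len(pooled)
    d2 = (pooled - pooled.T) ** 2                     # squared pairwise distances
    sigma2 = np.median(d2[np.triu_indices(n, k=1)])   # median heuristic (assumed)
    K = np.exp(-d2 / (2.0 * sigma2))

    is_a = np.zeros(n, dtype=bool)
    is_a[: len(sample_a)] = True
    observed = mmd2_biased(K, is_a)

    # Under the null the group labels are exchangeable, so shuffle them.
    null = np.array(
        [mmd2_biased(K, rng.permutation(is_a)) for _ in range(n_perm)]
    )
    return float((1 + np.sum(null >= observed)) / (1 + n_perm))
```

For instance, two zero-mean normal samples with standard deviations 1 and 3 have the same mean, so a comparison of means has nothing to detect, while the kernel statistic responds to the difference in spread.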

Methodological Foundation

kernP implements methods from two peer-reviewed papers:

  1. Hu, R., Sejdinovic, D., & Evans, R. J. (2024). A kernel test for causal association via noise contrastive backdoor adjustment. Journal of Machine Learning Research, 25(160), 1–56. Paper | Original code

  2. Fawkes, J., Hu, R., Evans, R. J., & Sejdinovic, D. (2024). Doubly robust kernel statistics for testing distributional treatment effects. Transactions on Machine Learning Research. Paper | Original code

The hierarchical extension for nested/clustered data is a novel contribution of this package rather than a method from the papers above.
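As a rough illustration of what permuting within clusters means, the sketch below shuffles observation indices only among members of the same cluster, so the cluster structure is preserved in every permuted dataset. The helper `permute_within_clusters` is hypothetical and is not part of kernP's documented API; how kernP itself builds its hierarchical null distribution is not described here.

```python
import numpy as np


def permute_within_clusters(cluster_ids, rng):
    """Return an index array that shuffles observations inside each cluster only."""
    cluster_ids = np.asarray(cluster_ids)
    idx = np.arange(len(cluster_ids))
    for c in np.unique(cluster_ids):
        members = np.flatnonzero(cluster_ids == c)
        # Positions belonging to cluster c receive a shuffled copy of themselves.
        idx[members] = rng.permutation(members)
    return idx


clusters = np.array([0, 0, 0, 1, 1, 2, 2, 2, 2])
perm = permute_within_clusters(clusters, np.random.default_rng(0))
# Every permuted observation still belongs to its original cluster.
assert np.array_equal(clusters[perm], clusters)
```

Applying `perm` to one variable (for example the outcome) while leaving the other fixed gives a null draw that breaks within-cluster dependence but leaves between-cluster differences intact.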

R Companion

An R implementation is available as kernR with an equivalent API built on RcppArmadillo.

Licence

MIT
