Add knowledgebase guide for custom PyMC models #741
drbenvincent wants to merge 3 commits into main
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@ Coverage Diff @@
## main #741 +/- ##
==========================================
+ Coverage 94.60% 94.63% +0.02%
==========================================
Files 46 46
Lines 7602 7602
Branches 462 462
==========================================
+ Hits 7192 7194 +2
+ Misses 249 248 -1
+ Partials 161 160 -1
Documentation build overview
Show files changed (3 files in total): 📝 2 modified | ➕ 1 added | ➖ 0 deleted
🤖 Automated Agent Review
Overall Assessment

This is a well-structured documentation PR that fills a genuine gap. The I verified the contract table against the source in

Issues and Suggestions

1. Missing mention of

The base class provides a

2. The issue's concern about model misuse is unaddressed

Issue #740 explicitly warns: "People who reach with this concern usually are doing very crazy wrong models... we'll open the door to do wrong models and because output looks nice, they might think [they] are correct." The notebook has no warning about model validation. Consider adding a

3. Notebook outputs bloat the repo (~845 KB)

The diff is ~845 KB, almost entirely from embedded plot outputs (~98K + ~342K + ~404K chars). Check whether the project convention is to strip outputs from knowledgebase notebooks (e.g., via

4. No minimal

The two MCMC cells use default

5. Contract table scope is unclear

The contract applies to models used with the standard experiment workflow (

6. No glossary term linking

Per project conventions, glossary terms should be linked on first mention using

7. Both

8. Does this fully close #740?

The PR documents existing capability (subclassing already works). The issue also envisions a simpler interface and mentions PyMC-Marketing integration. Documenting the current approach is a valid step, but "Closes #740" may be premature; consider "Partially addresses #740" or keeping the issue open for API-level improvements.

Minor / Cosmetic
Summary Table
The core content is valuable and mostly ready. Main action items before merge: (1) add a model validation warning, (2) address notebook output size, (3) mention
Document the two levels of model customization in CausalPy:

1. Prior-based (changing priors and likelihood via the Prior class)
2. Subclassing PyMCModel (for structural changes like link functions)

Includes the PyMCModel naming contract, a worked LogLinearRegression example, and an ITS demonstration with simulated revenue data.

Closes #740

Made-with: Cursor
Replace the revenue trend data with a sine-wave DGP where values oscillate close to zero at seasonal troughs. Add a side-by-side comparison with the default LinearRegression to show that its HDI bands extend into negative territory, which is the core motivation for the log-link model.

Made-with: Cursor
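The motivation stated in this commit can be illustrated without PyMC. The following is a minimal sketch, not code from the notebook: the coefficient values and the design row are hypothetical, chosen only so that the linear predictor is negative. With an identity link the mean equals the linear predictor and can be negative, whereas the log link mu = exp(X @ beta) is positive for any real-valued predictor.

```python
import math


def linear_predictor(x_row, beta):
    """Compute X @ beta for a single design-matrix row."""
    return sum(x * b for x, b in zip(x_row, beta))


def identity_link_mean(x_row, beta):
    """Mean under an identity link: the linear predictor itself."""
    return linear_predictor(x_row, beta)


def log_link_mean(x_row, beta):
    """Mean under a log link: exp of the linear predictor, always > 0."""
    return math.exp(linear_predictor(x_row, beta))


# Hypothetical values for illustration only (not taken from the notebook).
beta = [0.5, -1.5]   # intercept and one slope
x_row = [1.0, 1.0]   # intercept column and one covariate value

mean_identity = identity_link_mean(x_row, beta)  # negative linear predictor
mean_log = log_link_mean(x_row, beta)            # strictly positive
```

The same linear predictor gives a negative mean under the identity link and a positive one under the log link, which is the property a positive-valued outcome needs.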
Bump log-scale amplitude from 1.5 to 2.2 and add a -0.3 baseline shift. Trough values now reach ~0.07, making the linear model's negative HDI bands visually obvious.

Made-with: Cursor
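As a rough check on these numbers, here is a sketch of a noiseless version of such a data-generating process. It assumes the log-scale mean is simply the baseline shift plus the amplitude times a sine wave. The period of 12 and the series length of 48 are hypothetical, and the notebook's actual DGP (noise, period, any trend) is not shown here. Under these assumptions the trough is exp(-2.5), roughly 0.08, which is the same order as the ~0.07 quoted in the commit message.

```python
import math

AMPLITUDE = 2.2   # log-scale amplitude, from the commit message
BASELINE = -0.3   # log-scale baseline shift, from the commit message
PERIOD = 12       # assumed seasonal period (hypothetical)
N_OBS = 48        # assumed series length (hypothetical)


def noiseless_mean(t):
    """Mean on the original scale: exp of a shifted sine wave on the log scale."""
    log_mean = BASELINE + AMPLITUDE * math.sin(2 * math.pi * t / PERIOD)
    return math.exp(log_mean)


means = [noiseless_mean(t) for t in range(N_OBS)]
trough = min(means)  # close to zero but strictly positive
peak = max(means)
```

Because the series is an exponential of a real number, every value is positive even though the troughs sit very close to zero. A symmetric interval around a linear-model fit can dip below zero in exactly that regime.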
642cced to 4ff296a
Summary
Closes #740
- Adds a knowledgebase guide (docs/source/knowledgebase/custom_pymc_models.ipynb) documenting how to customize PyMC models in CausalPy
- Prior-based customization: changing priors and likelihood via the Prior class. References the existing Student-t example in its_pymc_comparative.ipynb.
- Subclassing PyMCModel: for structural changes like link functions, latent variables, or custom error structure
- A worked LogLinearRegression example (mu = exp(X @ beta)) for positive-valued outcomes, demonstrating the subclassing pattern with an ITS analysis on simulated revenue data

Rough edges noted for follow-up
While researching this, I identified some rough edges that could be addressed in future PRs:
- Variable names ("X", "y", "mu", "y_hat", "beta") are hardcoded strings; they could be class attributes to make them overridable
- build_model(): violating the naming contract produces cryptic KeyErrors rather than helpful messages
- print_coefficients() assumes the model has "beta"; custom models with different parameterizations silently break

Test plan
- pre-commit run --all-files passes

Made with Cursor
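One of the rough edges listed above is that violating the naming contract surfaces as a cryptic KeyError. The sketch below shows what an up-front check could look like. It is not part of CausalPy: `check_naming_contract` and `REQUIRED_NAMES` are hypothetical names, and treating all five listed names as required is an assumption based on the list in this description.

```python
# Hypothetical: names taken from the rough-edges list; not a CausalPy constant.
REQUIRED_NAMES = ("X", "y", "mu", "y_hat", "beta")


def check_naming_contract(model_names, required=REQUIRED_NAMES):
    """Raise a descriptive error if any contract name is missing from the model."""
    missing = [name for name in required if name not in model_names]
    if missing:
        raise ValueError(
            "Model violates the naming contract; missing variable(s): "
            + ", ".join(missing)
        )


# A model exposing every contract name passes silently.
check_naming_contract({"X", "y", "mu", "y_hat", "beta", "sigma"})

# A model that names its coefficients "theta" fails with a readable message.
try:
    check_naming_contract({"X", "y", "mu", "y_hat", "theta"})
except ValueError as err:
    message = str(err)
```

Keeping the required names in one tuple would also make them straightforward to expose as overridable class attributes, which is the other follow-up suggested above.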