Skip to content

Conversation

lrnv
Copy link

@lrnv lrnv commented Oct 4, 2025

Summary

This PR adds ChainRules coverage for the regularized incomplete beta function and its inverse:

  • Adds analytic frule/rrule for:
    • beta_inc(a, b, x) -> (p, q)
    • beta_inc(a, b, x, y) with y = 1 − x semantics
    • beta_inc_inv(a, b, p) -> (x, 1 − x)
  • Add a lot of tests in test/chainrules.jl.

I did that by translating the S+ algorithm of Boik & Robinson-Cox (1998) for the partial derivatives w.r.t. a, b, x.

Motivation

  • Allow AD to go through beta_inc and beta_inc_inv.
  • As hinted in beta_inv and beta_inv_inc gradients. #505, there a lot of places where these derivatives are missing:
    • Recently on this discourse thread
    • To fit a Distributions.MvTDist(), which was never fittable due to the lack of these derivatives.
    • In Copulas.jl, when using Distributions.Beta() marginals
    • In probably other issues that I do not know about.

I definitely think these derivatives belong to SpecialFunctions.jl's ChainRule's extension.

Final note

The reference of the algo is :

Boik, R. J., & Robinson-Cox, J. F. (1998). Derivatives of the incomplete beta function with respect to its parameters. Computational Statistics & Data Analysis, 27(1), 85–106.

Maybe we should have it as a proper reference in some documentation somewhere ? Dont know.

The algorithm is quite stable, and has good precision (see tests rtol). It works for Float64 and Float32. I do not think I broke anything so this should be patch-level change.

Waiting for review :)

Copy link

codecov bot commented Oct 4, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 94.83%. Comparing base (1743a8b) to head (6279c50).

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #506      +/-   ##
==========================================
+ Coverage   94.03%   94.83%   +0.80%     
==========================================
  Files          14       14              
  Lines        2902     3178     +276     
==========================================
+ Hits         2729     3014     +285     
+ Misses        173      164       -9     
Flag Coverage Δ
unittests 94.83% <100.00%> (+0.80%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@lrnv
Copy link
Author

lrnv commented Oct 5, 2025

@asinghvi17, @devmotion: I have switched implementations from the porting of the original code (which was probably not licensed correctly as we discussed yesterday) to a port of this repository, which is an independent implementation from @arzwa, unlicensed yet.

I have better hopes that @arzwa will allow us to use his code, and have emailed him. By looking at the diff you'll see that this is clearly an independent implementation, and thus, if @arzwa agrees, we are good license-wise.

I also have contacted the original authors, so if they come around faster we can still revert my last commit.

@arzwa
Copy link

arzwa commented Oct 5, 2025

I agree, you can use the code as you wish, I added an MIT license to the repo.

@lrnv
Copy link
Author

lrnv commented Oct 5, 2025

I knew it'll be faster 🤣

@arzwa
Copy link

arzwa commented Oct 5, 2025

@asinghvi17, @devmotion: I have switched implementations from the porting of the original code (which was probably not licensed correctly as we discussed yesterday) to a port of this repository, which is an independent implementation from @arzwa, unlicensed yet.

Now that I'm reading mentioned discussion, I would like to add that I definitely implemented this using the approach described in the mentioned paper (Boik & Robinson-Cox 1998). As far as I remember, I did not 'translate' the code but implemented the described numerical method 'independently', based on their mathematical description (in fact I'm pretty sure I have never seen S-plus code in my life). However, I don't know what that would mean license-wise and whether it's OK to use an MIT license (I have never had to think about those kind of issues before). Any insights on this?

@bdeonovic
Copy link

I would test a much larger range of a,b values. Look at the code for calculation of the incomplete beta in this repo. They use different algorithms depending on the domain.

Copy link
Member

@devmotion devmotion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All the nested functions, multiple assignments on a single line and generally lack of comments make it quite challenging to follow the code. Maybe the implementation should be cleaned up a bit before it should be considered for inclusion in SpecialFunctions.

@lrnv
Copy link
Author

lrnv commented Oct 13, 2025

So, I have :

  1. Moved the internal methods out of the function for readability
  2. Added a lot of comments to help following the code (also look at the paper that will help)
  3. Increased largely the coverage of (a,b,x) in my tests, by looking at what was done in test/beta_inc.jl, I think I hit most of the branches as @bdeonovic mentioned.
  4. Fixed the behavior by "merging" two methods (not conform to what @arzwa implemented previously) to avoid cancellations.

A slightly potentially-problematic outcome : the tests of the new chainrule alone last for ~40s on my machine, which is a large part of the total test time of the package :/

Ready for review round 2.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants