Skip to content

modified stats exponential distribution procedures to use loc and scale #991

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

sebastian-mutz
Copy link

There’s some inconsistency in passed arguments for normal and exponential distributions. Consider the pdfs:
result = pdf_normal(x, loc, scale)
result = pdf_exp(x, lambda)

While the normal distribution procedure uses loc and scale (reminiscent of scipy) as opposed to mu and sigma, the exponential distribution procedure uses lambda rather than scale (also often referred to as beta, which is 1/lambda) and is missing loc.

As discussed on fortran-lang discourse, I suggest introducing more consistency in the stats distribution procedures API. I have made some modifications to stdlib_stats_distribution_exponential (and associated examples and tests) to:

  1. use scale instead of lambda, and
  2. allow passing an additional location parameter loc.

With the suggested modifications in the PR, rvs, pdf and cdf procedures can now be called as such:
result = rvs_exp(x, loc, scale),
result = pdf_exp(x, loc, scale), and
result = cdf_exp(x, loc, scale).

@jalvesz
Copy link
Contributor

jalvesz commented May 10, 2025

Thanks for this PR @sebastian-mutz! code wise it looks good to me. Since an API change is being proposed this is a breaking change that might impact some users. I would suggest just to link back on discourse with the summary of the proposal here to alert possible users.

@jalvesz jalvesz requested review from jvdp1 and Beliavsky May 10, 2025 15:35
Copy link
Member

@perazz perazz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for this PR @sebastian-mutz. LGTM, I only have a couple comments.

  • Regarding the API, I would be in favor of adding the new one as an interface rather than replacing the old. This would be in favor of backward compatibility and user flexibility. The former one could be removed in a next version, if deemed obsolete.

  • Would you mind to also update the specs file for the exponential? it is at

https://github.com/fortran-lang/stdlib/blob/master/doc/specs/stdlib_stats_distribution_exponential.md

@sebastian-mutz
Copy link
Author

sebastian-mutz commented May 10, 2025

Thanks, @perazz.

Regarding the API, I would be in favor of adding the new one as an interface rather than replacing the old. This would be in favor of backward compatibility and user flexibility.

Keeping this flexibility makes sense. Since the different parameterisation requires (slightly) different calculations, it may not be as simple as adding the new ones as an interface (or perhaps I misunderstand your suggestion?). I could, however, a) rename the new procedures and create wrapper functions to keep the old API, and b) additionally create a generic interface for both.
EDIT: I should just be able to append the procedures + wrappers to the existing unifying interface. Since they take different numbers of arguments, they are distinguishable.

Would you mind to also update the specs file for the exponential?

I missed this. Thanks. I'll update it.

Copy link
Member

@jvdp1 jvdp1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here are some comments.

  • I suggest to keep the older API, if possible and if everybody agrees on it.
  • The specs should be updated.
  • The tests (at least the true values) should not be modified (except if they are wrong). If new true values are needed, new tests should be added.

0.371111977992924465160247867364281152_${k1}$, &
0.811918715695663687862785688290993341_${k1}$, &
0.404637854946697868759504975362991277_${k1}$]
[ 1.37178108290154243675829093263018876_${k1}$, &
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally, these values should not be modified. Only the calls should be modified and return the same values as the previous versions.

res = 0._${k1}$
else
res = 1.0_${k1}$ - exp(- x * lambda)
res = 1.0_sp - exp(- (x - loc) / scale)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
res = 1.0_sp - exp(- (x - loc) / scale)
res = 1.0_sp - exp(- (x - loc) / scale)
Suggested change
res = 1.0_sp - exp(- (x - loc) / scale)
res = 1.0_${k1}$ - exp(- (x - loc) / scale)

Comment on lines 132 to 135
res(i) = expon_rvs(loc, scale) ! 1 dummy
end do
res(6:10) = expon_rvs(scale, k) ! 2 dummies
res(6:10) = expon_rvs(loc, scale, k) ! 2 dummies
call check(all(abs(res - ans) < ${k1}$tol), &
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you provide a new API (and new tests) and keep this one (at least until a next release)?

The API could be something like:

function expon_rvs(scale) result(res)
  ...
  res = expon_rrvs(loc = 0, scale = 1/scale,...)
end function

Copy link
Author

@sebastian-mutz sebastian-mutz May 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review and comments, @jvdp1. If I understand correctly, you suggest renaming the updated procedures and creating a wrapper to keep the old API in tact for now. So in your example, scale in expon_rvs(scale) would actually be lambda.

@sebastian-mutz
Copy link
Author

sebastian-mutz commented May 13, 2025

I've updated the PR with a new commit. Based on your comments, I've changed the following:

Exponential Distribution Procedures

  • I created wrapper functions that take lambda and pass loc=0.0_kind and scale=1.0_kind/lambda to the updated functions.
  • The wrapper functions' names simply include a lamba, so the equivalent of rvs_exp_array_typeKind is rvs_exp_array_lambda_typeKind.
  • The wrapper functions are added to the unified interfaces. This is possible, because all procedures remain distinguishable by the passed arguments (either the number or type/kind).

Examples

  • Examples have been updated to demonstrate the use of the old (lambda based) and new (loc and scale based) interfaces through the unified interfaces. E.g., exp_pdf(1.0, 1.0) uses lambda(=1.0) , while exp_pdf(1.0, 0.0, 1.0) uses loc(=0.0) and scale(=1.0) and they both yield 0.367879450.

Tests

  • The pre-defined test answers from the previous version have been restored.
  • Tests for the new interfaces are included, and test results checked against the same pre-defined answers. (All tests pass).

Specs

  • The documentation md file has been updated. It now introduces both ways of using the unified interfaces. The 2 different ways of using the unified interface are clearly separated in the syntax section.

Copy link
Member

@perazz perazz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with the attached edits, thank you @sebastian-mutz! Let us wait for @jvdp1's review as well.

@sebastian-mutz
Copy link
Author

Thanks for commiting the specs update already, @perazz.

@perazz
Copy link
Member

perazz commented May 17, 2025

Yes, let's wait another couple of days for additional comments, and if not, this is ready to merge imho.

Copy link
Member

@jvdp1 jvdp1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thank you @sebastian-mutz

I wonder if the old API should still be mentioned in the specs. The wrappers are importatn for backward compatibility,but I think that they should not be mentioned in the specs to avoid that people use them. Or a note mentioning that they are deprecated should be at least added in the specs. @perazz what do you think about that?

@@ -31,6 +36,10 @@ The algorithm used for generating exponential random variates is fundamentally l

`result = ` [[stdlib_stats_distribution_exponential(module):rvs_exp(interface)]] `([lambda] [[, array_size]])`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest to remove this one from the specs or to mention that it is deprecated.

@@ -69,10 +86,16 @@ For a complex variable \(z=(x + y i)\) with independent real \(x\) and imaginary

$$f(x+\mathit{i}y)=f(x)f(y)=\begin{cases} \lambda_{x} \lambda_{y} e^{-(\lambda_{x} x + \lambda_{y} y)} &x\geqslant 0, y\geqslant 0 \\\\ 0 &\text{otherwise}\end{cases}$$

Instead of of the inverse scale parameter `lambda`, it is possible to pass `loc` and `scale`, where \(scale = \frac{1}{\lambda}\) and `loc` specifies the value by which the distribution is shifted.

### Syntax

`result = ` [[stdlib_stats_distribution_exponential(module):pdf_exp(interface)]] `(x, lambda)`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here: remove or mention it is deprecated.

Comment on lines +15 to 17
! standard exponential cumulative distribution at x=1.0 with lambda=1.0
print *, exp_cdf(1.0, 1.0)
! 0.632120550
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the example, I suggest to remove the old API.

@perazz
Copy link
Member

perazz commented May 17, 2025

In fine either way @jvdp1: my personal preference would be to keep offering both versions, I don’t see a problem having a specialized option (lambda only) that complements the shared API. But if that will be gradually deprecated, then I agree with you and it should not be in the top level specs anymore

@sebastian-mutz
Copy link
Author

sebastian-mutz commented May 17, 2025

Thanks, @jvdp1 and @perazz.

I don’t see a problem having a specialized option (lambda only) that complements the shared API

Me neither (but happy to change this if decided). The only problem I can think of is if any arguments are made optional at some point (so the module procedures could become indistinguishable). However, at this point, it'd be another API change requiring more attention and changes in documentation anyway.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants