This package generates pseudo-random numbers from a given sample of numbers, each with an assigned probability (a discrete probability distribution).
It uses a discrete sampling method. The algorithm is as follows:
- read and validate the input numbers and their probabilities
- build the discrete cumulative distribution, i.e. divide the [0,1] range into buckets according to the given probabilities
- generate a pseudo-random number x from the uniform distribution on [0,1)
- loop over the input sample and the cumulative distribution, comparing x with the current cumulative value
- return the first input number for which x is smaller than the corresponding cumulative value, i.e. the number whose bucket x falls into
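The steps above can be sketched in plain Python as follows. This is a minimal illustration, not the package's actual implementation: the names `build_cumulative` and `sample_once`, the optional `x` argument, and the validation tolerance are all assumptions made for this example.

```python
import random
from itertools import accumulate


def build_cumulative(probabilities):
    """Validate the probabilities and return their running totals (bucket upper bounds)."""
    if any(p < 0 for p in probabilities):
        raise ValueError("probabilities must be non-negative")
    # Tolerance of 1e-9 is an arbitrary choice for this sketch.
    if abs(sum(probabilities) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return list(accumulate(probabilities))


def sample_once(numbers, probabilities, x=None):
    """Return the number whose bucket contains x (x is drawn from [0, 1) if omitted)."""
    if len(numbers) != len(probabilities):
        raise ValueError("numbers and probabilities must have the same length")
    cumulative = build_cumulative(probabilities)
    if x is None:
        x = random.random()
    # Linear scan: the first bucket whose upper bound exceeds x wins.
    for number, upper_bound in zip(numbers, cumulative):
        if x < upper_bound:
            return number
    # Floating-point rounding can leave the last bound slightly below 1.
    return numbers[-1]
```

Passing `x` explicitly makes the bucket lookup deterministic, which is handy for checking the bucket boundaries; leaving it out draws from `random.random()`.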
Essentially, the functionality of this package is equivalent to:
import random
random.choices(random_nums, weights=probabilities)[0]
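For a self-contained version of the snippet above, the sketch below fills in `random_nums` and `probabilities` with illustrative values (the concrete numbers are an assumption for this example). It also shows `cum_weights`, the standard-library argument that accepts the cumulative distribution directly, which mirrors the bucket approach described earlier.

```python
import random
from itertools import accumulate

# Illustrative inputs; any numbers with matching weights would do.
random_nums = [-1, 0, 1, 2, 3]
probabilities = [0.01, 0.3, 0.58, 0.1, 0.01]

# Single draw using relative weights.
single = random.choices(random_nums, weights=probabilities)[0]

# Same distribution expressed as cumulative weights (the bucket upper bounds).
cumulative = list(accumulate(probabilities))
batch = random.choices(random_nums, cum_weights=cumulative, k=1_000)

print(single, len(batch))
```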
- First, install the package:
python -m venv .venv
. .venv/bin/activate
python -m pip install .
- To generate a single number:
>>> from randomgen import RandomGen
>>> generator = {-1: 0.01, 0: 0.3, 1: 0.58, 2: 0.1, 3: 0.01}
>>> RandomGen.init_generator(generator)
>>> RandomGen().next_num()
1 # example!
- To generate a larger sample, run e.g.:
>>> results = []
>>> sample = 1_000
>>> for _ in range(sample):
...     results.append(RandomGen().next_num())
...
- Then, you can check the results:
>>> from collections import Counter
>>> experiment = {num: count/sample for num, count in Counter(results).items()}
>>> compare = {num: (theory, experiment.get(num, 0.0)) for num, theory in generator.items()}
>>> from pprint import pprint
>>> pprint(compare) # just an example!
{-1: (0.01, 0.015),
 0: (0.3, 0.313),
 1: (0.58, 0.582),
 2: (0.1, 0.085),
 3: (0.01, 0.005)}
This project uses Poetry to build, test, and package the code. To install the project, run the following (remember to deactivate
any previously activated virtual environment first):
poetry install
To run the tests:
poetry run pytest .
Warning! One of the tests may occasionally fail because its outcome depends on random sampling. If that happens, rerun the tests.
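To see why such a test can fail, consider a hypothetical frequency check (not the project's actual test, whose details are not shown here). For n draws, the observed frequency of a value with probability p fluctuates with a standard deviation of about sqrt(p(1-p)/n), so any fixed tolerance is exceeded every now and then. The function name `frequencies_within_tolerance`, the `n_sigma` parameter, and the use of `random.Random.choices` as a stand-in generator are assumptions for illustration.

```python
import math
import random
from collections import Counter


def frequencies_within_tolerance(distribution, draws, n_sigma=4.0):
    """Return True if every observed frequency is within n_sigma standard deviations."""
    total = len(draws)
    counts = Counter(draws)
    for number, p in distribution.items():
        observed = counts.get(number, 0) / total
        # Standard deviation of a binomial proportion over `total` trials.
        sigma = math.sqrt(p * (1.0 - p) / total)
        if abs(observed - p) > n_sigma * sigma:
            return False
    return True


distribution = {-1: 0.01, 0: 0.3, 1: 0.58, 2: 0.1, 3: 0.01}
rng = random.Random(12345)  # fixed seed so the run is repeatable
draws = rng.choices(list(distribution), weights=list(distribution.values()), k=10_000)
print(frequencies_within_tolerance(distribution, draws))
```

A wider tolerance or a larger sample makes a spurious failure rarer, but no finite setting rules it out entirely; fixing the seed, as above, is one way to make such a check repeatable.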
The code is formatted with black, imports are sorted with isort, and types are checked with mypy. All of these tools run via pre-commit.