Review of codebase and docs - Probabilities and Encodings - Datseris (#213)

* update probabilities table

* CountOccurrences works with `Any` input

* better terminology header

* simpler headers in probabilities

* Add encodings page

* simplify SymbolicPermutation docstring

* reference complexity measures

* correct docstring to reference isrand

* more organized tests for symbolic permutation

* full rewrite of `SymbolicPermutation` and proper `encode` for Ordinal.

* type optimization in making the embedding

* remove entropy!

* simplify probabilities! even more

* move fasthist to encoding folder

* complete unification of symbolic perm methods

* docstring for weighted version

* add docstring to amplitude aware

* delete ALL other files

* fix all symbolic permutation tests

* fix all permutation tests (and one file only)

* clarify source code of encode Gaussian

* better docstring for GaussEncod

* simplify docstring of Dispersion

* more tests for naivekernel

* Zhu -> Correa

* shorter docstring for spatial permutation

* port spatial permutation example to Examples

* re-write SpatialSymb to have encoding as field. All tests pass.

* better display of examples in decode

* better doc for ordinal encoding

* Some typos/nitpickery

* Probabilities can't compute. 

Computations are done with probabilities as *input*

* Don't duplicate `SpatialDispersion`

* Clarify docstrings a bit

* Typo

* Cross-reference spatial estimators

Co-authored-by: Kristian Haaga <[email protected]>
Datseris and kahaaga authored Dec 25, 2022
1 parent 3de9294 commit cbb3319
Showing 37 changed files with 725 additions and 1,033 deletions.
1 change: 0 additions & 1 deletion docs/Project.toml
@@ -2,7 +2,6 @@
CairoMakie = "13f3f980-e62b-5c42-98c6-ff1f3baf88f0"
ChaosTools = "608a59af-f2a3-5ad4-90b4-758bdf3122a7"
CoordinateTransformations = "150eb455-5306-5404-9cee-2592286d6298"
DelayEmbeddings = "5732040d-69e3-5649-938a-b6b4f237613f"
Distributions = "31c24e10-a181-5473-b8eb-7969acd0382f"
Documenter = "e30172f5-a6a5-5a46-863b-614d45cd2de4"
DocumenterTools = "35a29f4d-8980-5a13-9543-d66fff28ecb8"
3 changes: 2 additions & 1 deletion docs/make.jl
@@ -2,10 +2,10 @@ cd(@__DIR__)
using Pkg
CI = get(ENV, "CI", nothing) == "true" || get(ENV, "GITHUB_TOKEN", nothing) !== nothing
using Entropies
using DelayEmbeddings
using Documenter
using DocumenterTools: Themes
using CairoMakie
using Entropies.DelayEmbeddings
import Entropies.Wavelets

# %% JuliaDynamics theme
@@ -35,6 +35,7 @@ ENV["JULIA_DEBUG"] = "Documenter"
ENTROPIES_PAGES = [
"index.md",
"probabilities.md",
"encodings.md",
"entropies.md",
"complexity.md",
"multiscale.md",
2 changes: 1 addition & 1 deletion docs/src/devdocs.md
@@ -11,7 +11,7 @@ Good practices in developing a code base apply in every Pull Request. The [Good
5. If suitable, the estimator may be able to operate based on [`Encoding`]s. If so, it is preferred to implement an `Encoding` subtype and extend the methods [`encode`](@ref) and [`decode`](@ref). This will allow your probabilities estimator to be used with a larger span of entropy and complexity methods without additional effort.
6. Implement dispatch for [`probabilities_and_outcomes`](@ref) and your probabilities estimator type.
7. Implement dispatch for [`outcome_space`](@ref) and your probabilities estimator type.
8. Add your probabilities estimator type to the table in the probabilities documentation page. If you also made an encoding, add it to the corresponding table in the encodings section.
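
For orientation, below is a minimal sketch of what steps 5-7 could look like. Everything in it is hypothetical and invented for illustration (the `SignEncoding` and `SignCounting` types and their logic are not part of the package), and the exact method signatures should be checked against the API docstrings.

```julia
using Entropies

# Step 5 (hypothetical): an encoding mapping reals to the positive integers 1 or 2.
struct SignEncoding <: Encoding end
Entropies.encode(::SignEncoding, x::Real) = x < 0 ? 2 : 1
Entropies.decode(::SignEncoding, i::Int) = i == 2 ? :negative : :nonnegative

# Steps 6-7 (hypothetical): an estimator that counts the encoded symbols.
struct SignCounting <: ProbabilitiesEstimator end

function Entropies.probabilities_and_outcomes(::SignCounting, x)
    symbols = encode.(Ref(SignEncoding()), x)        # integers 1 or 2
    counts = [count(==(i), symbols) for i in 1:2]    # occurrences of each possible symbol
    probs = Probabilities(counts ./ length(symbols))
    outs = decode.(Ref(SignEncoding()), 1:2)
    return probs, outs
end

Entropies.outcome_space(::SignCounting) = [:nonnegative, :negative]
```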

### Optional steps
You may extend any of the following functions if there are potential performance benefits in doing so:
20 changes: 20 additions & 0 deletions docs/src/encodings.md
@@ -0,0 +1,20 @@
# Encodings

## Encoding API

Some probability estimators first "encode" input data into an intermediate representation indexed by the positive integers. This intermediate representation is called an "encoding" and its API is defined by the following:

```@docs
Encoding
encode
decode
```

## Available encodings

```@docs
OrdinalPatternEncoding
GaussianCDFEncoding
RectangularBinEncoding
```
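
As a quick usage sketch (assuming here that `OrdinalPatternEncoding` takes the pattern length as its argument; consult the docstrings above for the exact constructor):

```julia
using Entropies

encoding = OrdinalPatternEncoding(3)  # encode length-3 state vectors

x = [1.2, 5.0, 3.3]      # a state (delay) vector
i = encode(encoding, x)  # an integer in 1:factorial(3) identifying the ordinal pattern of x
p = decode(encoding, i)  # the ordinal pattern (permutation) that the integer i stands for
```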

42 changes: 36 additions & 6 deletions docs/src/examples.md
@@ -233,7 +233,7 @@ fig

### Kaniadakis entropy

Here, we show how [`Kaniadakis`](@ref) entropy changes as a function of the parameter `a` for
a range of two-element probability distributions given by
`Probabilities([p, 1 - p] for p in 0.0:0.01:1.0)`.

Expand Down Expand Up @@ -370,11 +370,41 @@ end
You see that while the direct entropy values of the chaotic and noisy signals change massively with `N`, they are almost the same for the normalized version.
For the regular signals, the entropy nevertheless decreases because the noise contribution of the Fourier computation becomes less significant.

## Spatiotemporal permutation entropy

Usage of a [`SpatialSymbolicPermutation`](@ref) estimator is straightforward.
Here we get the spatial permutation entropy of a 2D array (e.g., an image):

```@example MAIN
using Entropies
x = rand(50, 50) # some image
stencil = [1 1; 0 1] # or one of the other ways of specifying stencils
est = SpatialSymbolicPermutation(stencil, x)
h = entropy(est, x)
```

To apply this to timeseries of spatial data, simply loop over the call, e.g.:

```@example MAIN
data = [rand(50, 50) for i in 1:10] # e.g., evolution of a 2D field of a PDE
est = SpatialSymbolicPermutation(stencil, first(data))
h_vs_t = map(d -> entropy(est, d), data)
```

Computing any other generalized spatiotemporal permutation entropy is trivial, e.g. with [`Renyi`](@ref):

```@example MAIN
x = reshape(repeat(1:5, 500) .+ 0.1*rand(500*5), 50, 50)
est = SpatialSymbolicPermutation(stencil, x)
entropy(Renyi(q = 2), est, x)
```


## Spatial discrete entropy: Fabio

Let's see how the normalized permutation and dispersion entropies increase for an image that gets progressively more noise added to it.

```@example MAIN
using Entropies
using Distributions
using CairoMakie
@@ -386,11 +416,11 @@ rot = warp(img, recenter(RotMatrix(-3pi/2), center(img));)
original = Float32.(rot)
noise_levels = collect(0.0:0.25:1.0) .* std(original) * 5 # % of 1 standard deviation
noisy_imgs = [i == 1 ? original : original .+ rand(Uniform(0, nL), size(original))
for (i, nL) in enumerate(noise_levels)]
# a 2x2 stencil (i.e. dispersion/permutation patterns of length 4)
stencil = ((2, 2), (1, 1))
est_disp = SpatialDispersion(stencil, original; c = 5, periodic = false)
est_perm = SpatialSymbolicPermutation(stencil, original; periodic = false)
@@ -399,8 +429,8 @@ hs_perm = [entropy_normalized(est_perm, img) for img in noisy_imgs]
# Plot the results
fig = Figure(size = (800, 1000))
ax = Axis(fig[1, 1:length(noise_levels)],
xlabel = "Noise level",
ylabel = "Normalized entropy")
scatterlines!(ax, noise_levels, hs_disp, label = "Dispersion")
scatterlines!(ax, noise_levels, hs_perm, label = "Permutation")
35 changes: 19 additions & 16 deletions docs/src/index.md
@@ -8,23 +8,22 @@ Entropies
You are reading the development version of the documentation of Entropies.jl,
which will become version 2.0.

## Terminology

!!! note
The documentation here follows (loosely) chapter 5 of
[Nonlinear Dynamics](https://link.springer.com/book/10.1007/978-3-030-91032-7),
Datseris & Parlitz, Springer 2022.

In the literature, the term "entropy" is used (and abused) in multiple contexts.
The API and documentation of Entropies.jl aim to clarify some aspects of its usage, and to provide a simple way to obtain probabilities, entropies, or other complexity measures.

### Probabilities

Entropies and other complexity measures are typically computed based on _probability distributions_.
These can be obtained from input data in a plethora of different ways.
The central API function that returns a probability distribution (or more precisely a probability mass function) is [`probabilities`](@ref), which takes in a subtype of [`ProbabilitiesEstimator`](@ref) to specify how the probabilities are computed.
All available estimators can be found in the [estimators page](@ref probabilities_estimators).
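
For example (a minimal sketch, assuming the estimator-first call signature `probabilities(est, x)` used throughout these docs):

```julia
using Entropies

x = rand(["a", "b", "c"], 1000)               # input data (here categorical, allowed by CountOccurrences)
probs = probabilities(CountOccurrences(), x)  # a vector of probabilities summing to 1
```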

### Entropies

@@ -40,24 +39,28 @@ Thus, any of the implemented [probabilities estimators](@ref probabilities_estim

These names are commonplace, and so in Entropies.jl we provide convenience functions like [`entropy_wavelet`](@ref). However, these functions are nothing more than two-line wrappers that call [`entropy`](@ref) with the appropriate [`ProbabilitiesEstimator`](@ref).
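
As a sketch of what such a wrapper amounts to (assuming that the defaults of [`entropy_wavelet`](@ref) and [`WaveletOverlap`](@ref) match; check their docstrings):

```julia
using Entropies

x = rand(1000)
h1 = entropy_wavelet(x)            # convenience wrapper
h2 = entropy(WaveletOverlap(), x)  # essentially what the wrapper does internally
# h1 and h2 coincide when the defaults match
```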

In addition to `ProbabilitiesEstimators`, we also provide [`EntropyEstimator`](@ref)s,
which compute entropies via alternate means, without explicitly computing some
probability distribution. Differential/continuous entropy, for example, is computed
using a dedicated [`EntropyEstimator`](@ref): the [`Kraskov`](@ref)
estimator computes Shannon differential entropy via a nearest-neighbor algorithm, while
the [`Correa`](@ref) estimator computes Shannon differential entropy using order statistics.
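
For example (a minimal sketch; the keyword `k` for the number of nearest neighbors is assumed here, see the [`Kraskov`](@ref) docstring for its actual parameters and defaults):

```julia
using Entropies

x = Dataset(randn(10_000, 2))   # samples from a 2D standard normal distribution
h = entropy(Kraskov(k = 4), x)  # estimate of the differential Shannon entropy
# for this distribution the exact value is 1 + log(2π) ≈ 2.84 nats (assuming base-e logarithms)
```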

### Other complexity measures

Other complexity measures, which strictly speaking don't compute entropies, and may or may not explicitly compute probability distributions, are found in the
[Complexity measures](@ref) page.
This includes measures like sample entropy and approximate entropy.

## [Input data for Entropies.jl](@id input_data)

The input data type typically depends on the probability estimator chosen.
In general though, the standard DynamicalSystems.jl approach is taken and as such we have three types of input data:

- _Timeseries_, which are `AbstractVector{<:Real}`, used e.g. with [`WaveletOverlap`](@ref).
- _Multi-dimensional timeseries, or datasets, or state space sets_, which are [`Dataset`](@ref), used e.g. with [`NaiveKernel`](@ref).
- _Spatial data_, which are higher dimensional standard `Array`s, used e.g. with [`SpatialSymbolicPermutation`](@ref).

```@docs
Dataset
```
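
To make the three cases concrete, here is a small sketch of what each input type looks like:

```julia
using Entropies

x_timeseries = rand(1000)              # timeseries: AbstractVector{<:Real}
x_dataset    = Dataset(rand(1000, 3))  # multi-dimensional timeseries / state space set
x_spatial    = rand(50, 50)            # spatial data: a higher-dimensional Array
```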
49 changes: 33 additions & 16 deletions docs/src/probabilities.md
@@ -1,4 +1,4 @@
# Probabilities

## Probabilities API

@@ -8,67 +8,77 @@ The probabilities API is defined by
- [`probabilities`](@ref)
- [`probabilities_and_outcomes`](@ref)

and related functions that you will find in the following documentation blocks:

### Probabilities

```@docs
ProbabilitiesEstimator
probabilities
probabilities!
Probabilities
```

### Outcomes

```@docs
probabilities_and_outcomes
outcomes
outcome_space
total_outcomes
missing_outcomes
```

## [Overview of probabilities estimators](@id probabilities_estimators)

Any of the following estimators can be used with [`probabilities`](@ref)
(in the column "input data" it is assumed that the `eltype` of the input is `<: Real`).

| Estimator                                   | Principle                    | Input data          |
|:--------------------------------------------|:-----------------------------|:--------------------|
| [`CountOccurrences`](@ref)                  | Count of unique elements     | `Any`               |
| [`ValueHistogram`](@ref)                    | Binning (histogram)          | `Vector`, `Dataset` |
| [`TransferOperator`](@ref)                  | Binning (transfer operator)  | `Vector`, `Dataset` |
| [`NaiveKernel`](@ref)                       | Kernel density estimation    | `Dataset`           |
| [`SymbolicPermutation`](@ref)               | Ordinal patterns             | `Vector`, `Dataset` |
| [`SymbolicWeightedPermutation`](@ref)       | Ordinal patterns             | `Vector`, `Dataset` |
| [`SymbolicAmplitudeAwarePermutation`](@ref) | Ordinal patterns             | `Vector`, `Dataset` |
| [`SpatialSymbolicPermutation`](@ref)        | Ordinal patterns in space    | `Array`             |
| [`Dispersion`](@ref)                        | Dispersion patterns          | `Vector`            |
| [`SpatialDispersion`](@ref)                 | Dispersion patterns in space | `Array`             |
| [`Diversity`](@ref)                         | Cosine similarity            | `Vector`            |
| [`WaveletOverlap`](@ref)                    | Wavelet transform            | `Vector`            |
| [`PowerSpectrum`](@ref)                     | Fourier transform            | `Vector`            |
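
For example (a minimal sketch using a histogram estimator; the `ValueHistogram(RectangularBinning(n))` constructor documented below and the estimator-first call signature are assumed):

```julia
using Entropies

x = randn(10_000)
est = ValueHistogram(RectangularBinning(10))      # 10 bins along each dimension
probs = probabilities(est, x)                     # probabilities only
probs, outs = probabilities_and_outcomes(est, x)  # probabilities and the bins they correspond to
```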

## Count occurrences

```@docs
CountOccurrences
```

## Histograms

```@docs
ValueHistogram
RectangularBinning
FixedRectangularBinning
```

## Symbolic permutations

```@docs
SymbolicPermutation
SymbolicWeightedPermutation
SymbolicAmplitudeAwarePermutation
```

## Dispersion patterns

```@docs
Dispersion
```

## Transfer operator

```@docs
TransferOperator
@@ -100,3 +110,10 @@ PowerSpectrum
```@docs
Diversity
```

## Spatial estimators

```@docs
SpatialSymbolicPermutation
SpatialDispersion
```
4 changes: 2 additions & 2 deletions src/Entropies.jl
@@ -24,10 +24,10 @@ include("complexity.jl")
include("multiscale.jl")

# Library implementations (files include other files)
include("encoding/all_encodings.jl") # other structs depend on these
include("probabilities_estimators/probabilities_estimators.jl")
include("entropies/entropies.jl")
include("encoding/all_encodings.jl")
include("complexity/complexity_measures.jl") # relies on encodings, so include after
include("complexity/complexity_measures.jl")
include("deprecations.jl")


2 changes: 2 additions & 0 deletions src/encoding/all_encodings.jl
@@ -1,2 +1,4 @@
include("fasthist.jl")
include("rectangular_binning.jl")
include("gaussian_cdf.jl")
include("ordinal_pattern.jl")