
Parallelization #216

Open
kahaaga opened this issue Dec 22, 2022 · 4 comments

Labels
improvement (Improvement of an existing feature), low priority (This isn't particularly important right now.), performance

Comments

@kahaaga (Member) commented Dec 22, 2022

In the following discussion, the issue of parallelization came up. This is a reminder of that.

    Probably best to use https://github.com/JuliaSIMD/Polyester.jl  because its loop is rather cheap. I will think about this once the API is super stable and 2.0 is out.

Originally posted by @Datseris in #213 (comment)

Things to think about:

  • How should we parallelize? The same way for all methods, or differently for some?
  • What about GPU compatibility? Can it be done, and for which methods? Should it be generic or backend-specific, i.e. what happens if I'm on macOS versus Windows?
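For context, the two threading candidates being weighed can be sketched on a toy kernel. This is purely illustrative (the function names and the `sin` workload are made up for the example, not the package's actual loop):

```julia
using Polyester  # https://github.com/JuliaSIMD/Polyester.jl

# Serial baseline.
function kernel_serial!(out, x)
    for i in eachindex(out, x)
        out[i] = sin(x[i])
    end
    return out
end

# Base Julia threading: partitions the range and spawns tasks,
# which carries some per-call scheduling overhead.
function kernel_threads!(out, x)
    Threads.@threads for i in eachindex(out, x)
        out[i] = sin(x[i])
    end
    return out
end

# Polyester's batch-based threading, whose loop setup is cheaper.
function kernel_batch!(out, x)
    @batch for i in eachindex(out, x)
        out[i] = sin(x[i])
    end
    return out
end
```

All three compute the same result; the question is only which parallelization strategy (if any) pays off for each estimator, and whether one choice can serve all methods.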
@kahaaga changed the title to Parallellization on Dec 22, 2022
@kahaaga added the improvement, low priority, and performance labels on Dec 22, 2022
@kahaaga (Member, Author) commented Dec 26, 2022

Some preliminary results, starting Julia with 8 threads:

using DelayEmbeddings, Entropies
m, τ, N = 7, 1, 1000000
est = SymbolicPermutation(; m, τ)
x = Dataset(rand(N, m)) # example input: N points of dimension m
πs_ts = zeros(Int, N); # pre-allocated symbol vector; length must match `x`

using BenchmarkTools, Test
probabilities!(πs_ts, est, x);
probabilities_parallel!(πs_ts, est, x);
probabilities_parallel_batch!(πs_ts, est, x);
@btime pn = probabilities!($πs_ts, $est, $x) # No threads
@btime pp = probabilities_parallel!($πs_ts, $est, $x) # Threads.@threads
@btime pb = probabilities_parallel_batch!($πs_ts, $est, $x) # Polyester.@batch, no configuration

> 85.572 ms (7 allocations: 7.71 MiB)
> 44.202 ms (49 allocations: 4.36 KiB)
> 37.254 ms (1 allocation: 48 bytes)

It definitely seems that there are performance gains to be made here. Some more sensitivity analyses are needed before settling on anything.

@Datseris (Member)

What's the code, though?

@Datseris (Member)

And why does the pure `probabilities!` allocate...?

@Datseris (Member) commented Dec 26, 2022

Notice that you can't thread without care. The method uses the internal pre-allocated `perm` array stored in the estimator. To parallelize, you would need as many copies of this array as `nthreads()`. I would guess the results you get would otherwise be wrong.
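One way to make that safe is to give each task its own scratch buffer instead of sharing the estimator's single pre-allocated `perm` array. A minimal sketch, assuming each point `x[i]` is a length-`m` vector; the function name and the base-`m` motif encoding here are illustrative placeholders, not the actual Entropies.jl internals:

```julia
using Base.Threads: @spawn, nthreads

function symbolize_parallel!(πs::Vector{Int}, x::AbstractVector, m::Int)
    # Split the indices into one contiguous chunk per thread.
    chunks = Iterators.partition(eachindex(πs, x), cld(length(x), nthreads()))
    tasks = map(chunks) do idxs
        @spawn begin
            perm = Vector{Int}(undef, m)  # task-local scratch; never shared
            for i in idxs
                sortperm!(perm, x[i])     # ordinal pattern of the i-th point
                # Encode the permutation as an integer in base m (injective).
                πs[i] = sum((perm[k] - 1) * m^(k - 1) for k in 1:m)
            end
        end
    end
    foreach(wait, tasks)
    return πs
end
```

Allocating one buffer per chunked task (rather than indexing a shared pool with `threadid()`) also sidesteps correctness issues when tasks migrate between threads.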
