Use assocreduce in mapcompute instead of splatting #73
Fixes #72
There is a fair bit of overhead here: more than 100% when the tree is really large. One argument for why this could be acceptable is that people might not use the lazy mode unless they have heavy computations, in which case the overhead should not be significant. I don't see `assocreduce` popping up in the profile viewer either, so maybe it is just more work for Dagger to move the partial results around.

I suppose one could split the thunks array more coarsely than going all the way down to a single element (e.g. doing the current splatting on up to 100-ish elements at a time). Not sure if it is worth the effort and extra moving parts though.
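To make the coarse-grained idea concrete, here is a minimal, language-neutral sketch in Python. It is not the code in this PR and does not use Dagger. The name `chunked_tree_reduce` and the `leaf_size` parameter are hypothetical, and the operator is assumed to be associative. Chunks of up to `leaf_size` elements are reduced in one go (standing in for the current splatting), and only the per-chunk results are combined pairwise.

```python
from functools import reduce
from operator import add


def chunked_tree_reduce(op, items, leaf_size=100):
    """Reduce `items` with an associative `op`.

    Hypothetical helper: the sequence is halved recursively until a piece
    has at most `leaf_size` elements; each such piece is reduced in a
    single step, and the partial results are combined pairwise.
    """
    if leaf_size < 1:
        raise ValueError("leaf_size must be at least 1")
    if not items:
        raise ValueError("cannot reduce an empty sequence")
    if len(items) <= leaf_size:
        # Leaf: one flat reduction over the whole chunk.
        return reduce(op, items)
    mid = len(items) // 2
    # Inner node: combine the two halves, preserving element order.
    left = chunked_tree_reduce(op, items[:mid], leaf_size)
    right = chunked_tree_reduce(op, items[mid:], leaf_size)
    return op(left, right)


total = chunked_tree_reduce(add, list(range(1000)), leaf_size=100)
```

With `leaf_size=1` this degenerates to a fully pairwise reduction down to single elements. Larger values trade tree depth for fewer, bigger leaf tasks, which is the extra moving part in question.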
Benchmarks
4 Threads, no Distributed:
Distributed with 4 workers (single thread per worker):