Dask Chunking Approach: Example in Documentation

We've discussed the motivation for a chunking approach as an alternative to sending massive task graphs to Dask. The main appeal is that chunking can potentially provide more memory-stable computation, at the cost of some looping overhead. That would give users who run into Dask issues a path forward other than Dask troubleshooting.

@wilsonbb and I talked about this in more depth and concluded that the best outcome is likely an example in our documentation that shows how to do this on something like the workflow in #42. This is preferable to building a bespoke `chunk` function: a built-in function would have many limitations on the graphs it can chunk (for example, anything where a global value is computed) and could therefore set bad expectations for users. Building something more general would risk creating an entire streaming interface that directly competes with Dask's own workflow.
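As a starting point for that documentation example, here is a minimal sketch of the looping pattern. All names in it (`iter_chunks`, `compute_in_chunks`, `build_task`, `compute_batch`) are hypothetical and not part of any existing API. The stand-in task and compute functions are plain Python so the sketch runs without a cluster. With Dask, one would presumably build tasks with `dask.delayed` and pass something like `lambda tasks: dask.compute(*tasks)` as `compute_batch`.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List, Sequence, TypeVar

T = TypeVar("T")
Task = TypeVar("Task")
R = TypeVar("R")


def iter_chunks(items: Iterable[T], chunk_size: int) -> Iterator[List[T]]:
    """Yield successive lists of at most `chunk_size` items."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


def compute_in_chunks(
    items: Iterable[T],
    build_task: Callable[[T], Task],
    compute_batch: Callable[[List[Task]], Sequence[R]],
    chunk_size: int,
) -> List[R]:
    """Build and compute one small batch of tasks at a time.

    Only `chunk_size` tasks exist at once, so the scheduler never sees
    the full graph. This only works when each item is independent; a
    step that needs a global value across all items does not fit.
    """
    results: List[R] = []
    for chunk in iter_chunks(items, chunk_size):
        tasks = [build_task(item) for item in chunk]
        results.extend(compute_batch(tasks))
    return results


# Hypothetical stand-ins so the sketch runs without Dask installed.
def build_task(x: int) -> Callable[[], int]:
    return lambda: x * x


def compute_batch(tasks: List[Callable[[], int]]) -> List[int]:
    return [task() for task in tasks]


squares = compute_in_chunks(range(10), build_task, compute_batch, chunk_size=4)
print(squares)
```

Here `chunk_size` is the knob that trades memory stability against looping overhead: smaller chunks keep each graph small but add more round trips.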

The first step is to verify that a chunking approach actually performs well, which @wilsonbb has agreed to explore as part of his work on #42.
