This repository was archived by the owner on Dec 1, 2025. It is now read-only.

Dask Chunking Approach: Example in Documentation #65

@dougbrn

Description


We've discussed the motivation for a chunking approach as an alternative to sending massive task graphs to Dask. The main appeal is that chunking can potentially provide more memory-stable compute, at the cost of some looping overhead. This would give users who run into Dask issues a path forward other than Dask troubleshooting.
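A minimal sketch of the chunking pattern, written in plain Python so it runs without a cluster. The names `iter_chunks`, `process_chunk`, and `run_chunked` are hypothetical and only for illustration. In a real workflow, `process_chunk` would presumably build and compute a small Dask graph for that chunk alone, so only one chunk's worth of work is in flight at a time.

```python
from typing import Callable, Iterator, List, Sequence


def iter_chunks(items: Sequence[int], chunk_size: int) -> Iterator[Sequence[int]]:
    """Yield consecutive slices of `items` with at most `chunk_size` elements."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(items), chunk_size):
        yield items[start:start + chunk_size]


def process_chunk(chunk: Sequence[int]) -> List[int]:
    """Hypothetical per-chunk work; stands in for computing a small graph."""
    return [x * x for x in chunk]


def run_chunked(
    items: Sequence[int],
    chunk_size: int,
    func: Callable[[Sequence[int]], List[int]] = process_chunk,
) -> List[int]:
    """Process one chunk at a time and concatenate the per-chunk results."""
    results: List[int] = []
    for chunk in iter_chunks(items, chunk_size):
        # Each iteration finishes before the next begins, which bounds
        # peak memory by the chunk size instead of the full input.
        results.extend(func(chunk))
    return results


squares = run_chunked(list(range(10)), chunk_size=4)
```

The loop trades some scheduling overhead for a bounded working set. The right `chunk_size` depends on the workload and is not something this sketch tries to choose.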

@wilsonbb and I talked about this in more depth and concluded that the best output would likely be an example in our documentation showing how to do this on something like the workflow in #42. This is preferable to building a bespoke chunk function: a built-in function would have many limitations on the graphs it can chunk (for example, anything where a global value is computed) and may therefore set bad expectations for users. Building something more general would risk creating an entire Dask streaming interface that directly competes with Dask's own workflow.
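To illustrate the "global value" limitation, here is a small plain-Python sketch (all function names are hypothetical). Subtracting a mean chunk by chunk gives a different answer than subtracting the mean of the whole dataset, so a generic chunker cannot simply loop over the graph. One workaround, assuming the global value can be accumulated incrementally, is a first pass that gathers it and a second pass that applies it.

```python
def chunks(values, size):
    """Yield consecutive slices of `values` with at most `size` elements."""
    for start in range(0, len(values), size):
        yield values[start:start + size]


def center_naive(values, size):
    """Subtract each chunk's own mean: NOT equivalent to global centering."""
    out = []
    for chunk in chunks(values, size):
        local_mean = sum(chunk) / len(chunk)
        out.extend(v - local_mean for v in chunk)
    return out


def center_two_pass(values, size):
    """Pass 1 accumulates the global mean; pass 2 applies it per chunk.

    Assumes `values` is non-empty.
    """
    total, count = 0.0, 0
    for chunk in chunks(values, size):
        total += sum(chunk)
        count += len(chunk)
    global_mean = total / count

    out = []
    for chunk in chunks(values, size):
        out.extend(v - global_mean for v in chunk)
    return out


data = [1.0, 2.0, 3.0, 10.0]
naive = center_naive(data, 2)       # [-0.5, 0.5, -3.5, 3.5]
correct = center_two_pass(data, 2)  # [-3.0, -2.0, -1.0, 6.0]
```

The two-pass version reads the data twice, which is part of why a documented example that users adapt to their own workflow may be easier to get right than a one-size-fits-all function.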

The first step is to verify that a chunking approach actually performs well, which @wilsonbb has agreed to explore as part of his work in #42.

Labels: documentation (Improvements or additions to documentation)
