This repository was archived by the owner on Apr 25, 2023. It is now read-only.

refactor intermediate data formats #36

@bryantChhun

Issue

Before we can work on enhanced parallelization and PyTorch dataloaders, we need to rethink the intermediate data formats used by dynamorph.

For each stage of the pipeline, we should explicitly define its inputs and outputs: file format, dimensionality, and file naming.
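One way to make a per-stage definition concrete is a small declarative record that captures the three properties listed above. This is only a sketch: the `DataSpec` class, the axis names, the site name, and the file name template are all hypothetical and are not taken from the dynamorph code base.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataSpec:
    """Describes one dataset read or written by a pipeline stage (hypothetical)."""

    file_format: str    # file extension / container, e.g. "zarr" or "npy"
    dims: tuple         # ordered axis names, e.g. ("T", "Y", "X")
    name_template: str  # file name pattern with {site} and {ext} placeholders

    def filename(self, site: str) -> str:
        # Fill in the template so every stage names its files the same way.
        return self.name_template.format(site=site, ext=self.file_format)


# Assumed example: the output of a segmentation stage.
SEGMENTATION_OUT = DataSpec(
    file_format="zarr",
    dims=("T", "Y", "X"),
    name_template="{site}_segmentation.{ext}",
)

print(SEGMENTATION_OUT.filename("A1-Site_0"))  # A1-Site_0_segmentation.zarr
```

Keeping the spec as frozen data (rather than scattering path strings through each script) means the same object can be shared by the stage that writes a file and the stage that reads it.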

Considerations

We primarily need:

  1. data consistency between each stage
  2. parallelization
  3. efficiency (both compute and loading; could zarr caching help?)
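The first requirement, consistency between stages, can be checked mechanically once each stage declares what it reads and writes. The sketch below assumes a linear pipeline and uses made-up stage names and layouts; it simply reports every adjacent pair where the producer's output does not match the consumer's declared input.

```python
from typing import NamedTuple, Optional


class Stage(NamedTuple):
    name: str
    input_spec: Optional[tuple]  # (file_format, dims); None for the first stage
    output_spec: tuple           # (file_format, dims)


def find_mismatches(stages):
    """Return (producer, consumer) name pairs whose declared specs disagree."""
    mismatches = []
    for producer, consumer in zip(stages, stages[1:]):
        if producer.output_spec != consumer.input_spec:
            mismatches.append((producer.name, consumer.name))
    return mismatches


# Hypothetical pipeline: the last stage expects "npy" but receives "zarr".
pipeline = [
    Stage("preprocess", None, ("zarr", ("T", "C", "Y", "X"))),
    Stage("segment", ("zarr", ("T", "C", "Y", "X")), ("zarr", ("T", "Y", "X"))),
    Stage("extract_patches", ("npy", ("T", "Y", "X")), ("zarr", ("N", "C", "Y", "X"))),
]

print(find_mismatches(pipeline))  # [('segment', 'extract_patches')]
```

A check like this could run before any data is processed, so a format mismatch surfaces immediately rather than partway through a long job.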

Questions

Can we avoid data duplication? In particular, are there intermediate stages that do not need to write their own copy of the data?

Labels: enhancement (New feature or request)
