Transitioning to Julian datasets and data loaders #7

@darsnack

I wanted to bring the general discussion from Zulip (linked below) into this issue. It's best to transfer concrete suggestions here so they are available for everyone to contribute to. The plan has evolved into two "milestones": short-term and long-term. I'll summarize both below.

Short-term:

  • Implement iterable and map-like datasets à la PyTorch
  • Implement a DataLoader interface mimicking PyTorch
  • The goal here is to take the PyTorch implementations as-is to get something up and running, since this functionality is desperately needed in the ecosystem
  • Being tracked here: Implement Dataloader interface #5
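
To make the short-term milestone concrete, here is a minimal sketch of what a PyTorch-style map-like dataset and a bare-bones loader could look like in Julia. Every name in it (`MapDataset`, `getitem`, `len`, `ArrayDataset`, `SimpleLoader`, `batches`) is hypothetical and chosen only to mirror PyTorch's `__getitem__`/`__len__` convention; none of it is an existing API, and the interface tracked in #5 may look different.

```julia
# Hypothetical names throughout; this is an illustration, not an existing package API.
abstract type MapDataset end

# PyTorch-style accessors, analogous to `__getitem__` and `__len__`.
function getitem end
function len end

# A toy map-like dataset: one observation per column of `features`.
struct ArrayDataset{T} <: MapDataset
    features::Matrix{T}
    labels::Vector{Int}
end

getitem(ds::ArrayDataset, i::Integer) = (ds.features[:, i], ds.labels[i])
len(ds::ArrayDataset) = length(ds.labels)

# A minimal loader that groups consecutive items into batches of `batchsize`.
struct SimpleLoader{D<:MapDataset}
    dataset::D
    batchsize::Int
end

# Return all batches eagerly; the last batch may be smaller than `batchsize`.
function batches(dl::SimpleLoader)
    n = len(dl.dataset)
    return [[getitem(dl.dataset, i) for i in lo:min(lo + dl.batchsize - 1, n)]
            for lo in 1:dl.batchsize:n]
end

ds = ArrayDataset(rand(Float32, 4, 10), collect(1:10))
dl = SimpleLoader(ds, 4)
bs = batches(dl)   # 10 observations with batchsize 4 give batches of 4, 4 and 2 items
```

The sketch leaves out shuffling, collation into arrays, and parallel loading, all of which a real DataLoader would need.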

Long-term:

  • Transfer iterable dataset interface to Base iterator interface [1]
  • Transfer the map-like dataset interface to Base indexable collections [1]
  • The DataLoader interface doesn't necessarily need to change, but in the long term we should make sure that concrete implementations of the interface take advantage of things like the Random standard library and Distributions for sampling
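
As a rough illustration of the long-term direction, the sketch below expresses a map-like dataset through `Base.getindex`/`Base.length` and a batch iterator through `Base.iterate`, with the sampling order drawn using `randperm` from the Random standard library. The type names `TupleDataset` and `ShuffledBatches` are hypothetical, and as footnote [1] notes, the Base interfaces may need extending before they fit our needs.

```julia
using Random

# Hypothetical map-like dataset expressed as a Base indexable collection.
struct TupleDataset{X,Y}
    xs::Vector{X}
    ys::Vector{Y}
end
Base.length(ds::TupleDataset) = length(ds.xs)
Base.getindex(ds::TupleDataset, i::Integer) = (ds.xs[i], ds.ys[i])

# Hypothetical loader expressed through the Base iteration interface.
struct ShuffledBatches{D}
    dataset::D
    batchsize::Int
    order::Vector{Int}   # sampling order, fixed at construction
end

# Draw the sampling order from a Random RNG so that runs can be made reproducible.
ShuffledBatches(ds, batchsize; rng = Random.default_rng()) =
    ShuffledBatches(ds, batchsize, randperm(rng, length(ds)))

# Number of batches, counting a smaller final batch.
Base.length(b::ShuffledBatches) = cld(length(b.order), b.batchsize)

# The iteration state is the position of the next unread entry in `order`.
function Base.iterate(b::ShuffledBatches, start = 1)
    start > length(b.order) && return nothing
    stop = min(start + b.batchsize - 1, length(b.order))
    batch = [b.dataset[i] for i in b.order[start:stop]]
    return batch, stop + 1
end

ds = TupleDataset(collect(1.0:10.0), collect(1:10))
loader = ShuffledBatches(ds, 3; rng = MersenneTwister(42))

seen = Int[]
for batch in loader
    append!(seen, last.(batch))   # collect the label of every item in the batch
end
```

Because the loader only relies on `length` and `getindex`, any indexable collection (a plain `Vector`, for example) could be passed as the dataset without a wrapper type.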

Relevant Zulip topics:


[1]: the Base interfaces may not be perfect for our needs, so we might need to build off of them

Labels: parity:pytorch (needed for feature parity with PyTorch)