Transitioning to Julian datasets and data loaders #7

I wanted to bring the general discussion from Zulip (linked below) into this issue. It's best to transfer concrete suggestions here so they're available for everyone to contribute to. The plan has evolved into two "milestones": short-term and long-term. I'll summarize both below.

Short-term:
- [x] Implement iterable and map-like datasets à la PyTorch
- [x] Implement a DataLoader interface mimicking PyTorch
- The goal here is to port the PyTorch designs as-is to get something up and running, since this is desperately needed functionality in the ecosystem
- Being tracked here: #5 
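
To make the short-term milestone concrete, here is a minimal sketch of what a PyTorch-style map-like dataset and loader could look like in Julia. Every name in it (`AbstractMapDataset`, `getitem`, `nitems`, `ArrayDataset`, `SimpleLoader`, `batches`) is hypothetical and chosen for illustration only; it is not the API tracked in #5.

```julia
# Hypothetical names throughout; this is an illustration, not the package's API.
abstract type AbstractMapDataset end

# Functions a map-like dataset would implement
# (analogous to PyTorch's __getitem__ and __len__).
function getitem end
function nitems end

# A toy dataset backed by a vector.
struct ArrayDataset{T} <: AbstractMapDataset
    data::Vector{T}
end
getitem(d::ArrayDataset, i::Integer) = d.data[i]
nitems(d::ArrayDataset) = length(d.data)

# A minimal loader: consecutive batches of at most `batchsize` items, no shuffling.
struct SimpleLoader{D<:AbstractMapDataset}
    dataset::D
    batchsize::Int
end

function batches(loader::SimpleLoader)
    n = nitems(loader.dataset)
    starts = 1:loader.batchsize:n
    return [[getitem(loader.dataset, i) for i in lo:min(lo + loader.batchsize - 1, n)]
            for lo in starts]
end

loader = SimpleLoader(ArrayDataset(collect(1:10)), 4)
result = batches(loader)  # three batches; the last one is shorter
```

The dataset only has to answer "how many items" and "give me item `i`"; batching lives entirely in the loader, which is the separation PyTorch uses.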

Long-term:
- [x] Transfer iterable dataset interface to [Base iterator interface](https://docs.julialang.org/en/v1/manual/interfaces/#man-interface-iteration-1) [1]
- [x] Transfer the map-like dataset interface to [Base indexable collections](https://docs.julialang.org/en/v1/base/collections/#Indexable-Collections-1) [1]
- The DataLoader interface doesn't necessarily need to change, but long-term we should make sure that concrete implementations of the interface take advantage of things like the `Random` standard library and Distributions.jl for sampling
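
As a rough illustration of the long-term direction, the sketch below implements the Base iteration interface for an iterable dataset and `getindex`/`length` for a map-like dataset, then uses the `Random` standard library to draw a shuffled sampling order. The types `SquaresStream` and `LabeledData` are made up for this example, and the seed is arbitrary; the eventual design may need more than the plain Base interfaces provide [1].

```julia
using Random

# Hypothetical iterable dataset: streams the squares 1, 4, 9, ..., n^2.
struct SquaresStream
    n::Int
end
# Base iteration interface: return (item, next_state), or `nothing` when done.
Base.iterate(s::SquaresStream, i::Int = 1) = i > s.n ? nothing : (i^2, i + 1)
Base.length(s::SquaresStream) = s.n
Base.eltype(::Type{SquaresStream}) = Int

# Hypothetical map-like dataset: indexable collection of (feature, label) pairs.
struct LabeledData
    features::Vector{Float64}
    labels::Vector{Int}
end
Base.getindex(d::LabeledData, i::Int) = (d.features[i], d.labels[i])
Base.length(d::LabeledData) = length(d.labels)
Base.firstindex(d::LabeledData) = 1
Base.lastindex(d::LabeledData) = length(d)

# Any iterable dataset now works with generic Base functions such as `collect`.
squares = collect(SquaresStream(4))

# A sampler can be as small as a shuffled index order from Random.
data = LabeledData([0.1, 0.2, 0.3], [1, 0, 1])
rng = MersenneTwister(42)            # arbitrary seed, for reproducibility
order = shuffle(rng, 1:length(data)) # a permutation of the indices
samples = [data[i] for i in order]
```

With this approach a dataset needs no package-specific supertype: anything that supports `iterate`, or `getindex` plus `length`, can be consumed by a loader.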

Relevant Zulip topics:
- https://julialang.zulipchat.com/#narrow/stream/237432-ml-ecosystem-coordination/topic/Datasets
- https://julialang.zulipchat.com/#narrow/stream/237432-ml-ecosystem-coordination/topic/DataLoaders

----
[1]: the Base interfaces may not be perfect for our needs, so we might need to build off of them
