Issue
The OCI spec constrains blob uploads to be serial in nature: the incoming data has to be processed in order because the checksum of the data must be calculated. The sha256 algorithm is not "distributable"; it has to run serially over the bytes. This creates problems when uploading large artifacts like ML models, and makes OCI unattractive for large artifacts compared to object storage like S3, which supports multipart uploads.
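To illustrate why the digest forces serial processing, here is a minimal Go sketch: sha256 is a streaming hash whose state after each write depends on every prior byte, so chunk N cannot be hashed before chunks 0..N-1 have been consumed.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

func main() {
	// Every Write advances internal state that depends on all prior bytes,
	// so whoever verifies the final digest must see the data in order.
	h := sha256.New()
	for _, chunk := range [][]byte{[]byte("chunk-0"), []byte("chunk-1"), []byte("chunk-2")} {
		h.Write(chunk)
	}
	fmt.Printf("sha256:%x\n", h.Sum(nil))
}
```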
There have been prior attempts at addressing this where the idea was to support out-of-order chunked uploads. That approach, however, leaves assembly and checksum validation to the registry, which may take a long time or may not have the resources to pull a large blob into memory or onto disk to calculate the final checksum.
Use cases
Large artifacts are becoming prevalent in the OCI space. A few examples:
- AI models
- VM images
- DB backups
- Binaries for libs
Proposal
The proposal is to introduce a new layer mediaType which is an indirection to an index of chunks.
The chunk index is another blob which holds a list of chunks with their sizes and offsets. When uploading, the client chunks the file and uploads each chunk in parallel. Once all the chunks are uploaded, the client creates the chunk index, pushes it, and finally creates the manifest which references the chunk-index blob.
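To make this concrete, a chunk-index blob could look something like the sketch below. The field names and layout are purely illustrative; nothing about the shape is settled.

```json
{
  "schemaVersion": 1,
  "totalSize": 3221225472,
  "chunks": [
    { "digest": "sha256:aaaa…", "size": 1073741824, "offset": 0 },
    { "digest": "sha256:bbbb…", "size": 1073741824, "offset": 1073741824 },
    { "digest": "sha256:cccc…", "size": 1073741824, "offset": 2147483648 }
  ]
}
```

The manifest would then reference this blob through a layer descriptor carrying the new mediaType, while each chunk stays an ordinary, independently addressable blob.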
The advantage here is that no extra processing is required on the registry side. There is no reassembly on the server, which means the blob can be "committed" on the registry as soon as the final chunk is uploaded.
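A rough Go sketch of the client-side upload flow under this proposal. `uploadBlob` is a hypothetical stand-in for a normal OCI blob push (POST to open a session, then PUT with the digest); the point is that each chunk is hashed and uploaded independently of the others.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"sync"
)

// uploadBlob is a hypothetical helper wrapping an ordinary OCI blob push.
// Under this proposal each chunk is just a regular blob, so no new
// upload API is needed on the registry.
func uploadBlob(digest string, data []byte) error {
	// ... HTTP calls to the registry elided ...
	return nil
}

// pushChunks splits the artifact into fixed-size chunks and uploads them
// concurrently. Each chunk's sha256 is computed over that chunk alone, so
// there is no serial dependency; the ordered digest list is what later
// goes into the chunk-index blob.
func pushChunks(data []byte, chunkSize int) ([]string, error) {
	n := (len(data) + chunkSize - 1) / chunkSize
	digests := make([]string, n)
	errs := make([]error, n)

	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		start := i * chunkSize
		end := start + chunkSize
		if end > len(data) {
			end = len(data)
		}
		wg.Add(1)
		go func(i int, chunk []byte) {
			defer wg.Done()
			sum := sha256.Sum256(chunk) // hashed independently of other chunks
			digests[i] = fmt.Sprintf("sha256:%x", sum)
			errs[i] = uploadBlob(digests[i], chunk)
		}(i, data[start:end])
	}
	wg.Wait()

	for _, err := range errs {
		if err != nil {
			return nil, err
		}
	}
	return digests, nil
}

func main() {
	digests, _ := pushChunks(make([]byte, 10<<20), 4<<20) // 10 MiB in 4 MiB chunks
	fmt.Println(digests)
}
```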
Considerations/Issues
- Older clients and registries will not support this, so we have to find some way of being backwards compatible.
- Since there is no full reassembly on the registry, the full artifact is only available once assembly is done on the client side (as opposed to S3, where reassembly happens on the S3 side); a sketch of client-side assembly follows below.
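For that second point, client-side assembly might look something like this, assuming a `Chunk` type mirroring the index entries and a hypothetical `fetchBlob` helper wrapping a normal OCI blob GET:

```go
package client

import (
	"crypto/sha256"
	"fmt"
	"os"
)

// Chunk mirrors one entry of the chunk index (names are illustrative).
type Chunk struct {
	Digest string
	Size   int64
	Offset int64
}

// fetchBlob is a hypothetical helper wrapping a normal OCI blob GET.
func fetchBlob(digest string) ([]byte, error) {
	// ... HTTP calls to the registry elided ...
	return nil, nil
}

// assemble reconstructs the artifact on the client: each chunk is fetched
// (this loop could itself run in parallel), verified against its own
// digest, and written at its recorded offset.
func assemble(f *os.File, chunks []Chunk) error {
	for _, c := range chunks {
		data, err := fetchBlob(c.Digest)
		if err != nil {
			return err
		}
		sum := sha256.Sum256(data)
		if got := fmt.Sprintf("sha256:%x", sum); got != c.Digest {
			return fmt.Errorf("chunk at offset %d: digest mismatch", c.Offset)
		}
		if _, err := f.WriteAt(data, c.Offset); err != nil {
			return err
		}
	}
	return nil
}
```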