Issue
The OCI spec constrains blob uploads to be serial in nature: the incoming data has to be processed in order because the checksum of the data must be calculated. The sha256 algorithm is not "distributable"; it has to run serially over the bytes. This creates problems when uploading large artifacts like ML models, and makes OCI unattractive for large artifacts compared to object storage like S3, which supports multipart uploads.
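To illustrate why the digest forces serial processing, here is a minimal Go sketch: sha256 is a streaming hash whose state after each write depends on every prior byte, so chunk N cannot be hashed before chunks 0..N-1 have been consumed.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

func main() {
	// Every Write advances internal state that depends on all prior bytes,
	// so whoever verifies the final digest must see the data in order.
	h := sha256.New()
	for _, chunk := range [][]byte{[]byte("chunk-0"), []byte("chunk-1"), []byte("chunk-2")} {
		h.Write(chunk)
	}
	fmt.Printf("sha256:%x\n", h.Sum(nil))
}
```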
There have been prior attempts at addressing this where the idea was to support out-of-order chunked uploads. That approach, however, leaves assembly and checksum validation to the registry, which may take a long time or may not have the resources to pull a large blob into memory or onto disk to calculate the final checksum.
Use cases
Large artifacts are becoming prevalent in the OCI space. A few examples:
- AI models
- VM images
- DB backups
- Binaries for libs
Proposal
The proposal is to introduce a new layer mediaType which is an indirection to an index of chunks.
The chunk index is another blob which holds a list of chunks with their sizes and offsets. When uploading, the client chunks the file and uploads each chunk in parallel. Once all the chunks are uploaded, the client creates the chunk index, pushes it, and finally creates the manifest which references the chunk-index blob.
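To make this concrete, a chunk-index blob could look something like the sketch below. The field names and layout are purely illustrative; nothing about the shape is settled.

```json
{
  "schemaVersion": 1,
  "totalSize": 3221225472,
  "chunks": [
    { "digest": "sha256:aaaa…", "size": 1073741824, "offset": 0 },
    { "digest": "sha256:bbbb…", "size": 1073741824, "offset": 1073741824 },
    { "digest": "sha256:cccc…", "size": 1073741824, "offset": 2147483648 }
  ]
}
```

The manifest would then reference this blob through a layer descriptor carrying the new mediaType, while each chunk stays an ordinary, independently addressable blob.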
The advantage here is that no extra processing is required on the registry side. There is no reassembly on the server, which means the blob can be "committed" on the registry as soon as the final chunk is uploaded.
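A rough Go sketch of the client-side upload flow under this proposal. `uploadBlob` is a hypothetical stand-in for a normal OCI blob push (POST to open a session, then PUT with the digest); the point is that each chunk is hashed and uploaded independently of the others.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"sync"
)

// uploadBlob is a hypothetical helper wrapping an ordinary OCI blob push.
// Under this proposal each chunk is just a regular blob, so no new
// upload API is needed on the registry.
func uploadBlob(digest string, data []byte) error {
	// ... HTTP calls to the registry elided ...
	return nil
}

// pushChunks splits the artifact into fixed-size chunks and uploads them
// concurrently. Each chunk's sha256 is computed over that chunk alone, so
// there is no serial dependency; the ordered digest list is what later
// goes into the chunk-index blob.
func pushChunks(data []byte, chunkSize int) ([]string, error) {
	n := (len(data) + chunkSize - 1) / chunkSize
	digests := make([]string, n)
	errs := make([]error, n)

	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		start := i * chunkSize
		end := start + chunkSize
		if end > len(data) {
			end = len(data)
		}
		wg.Add(1)
		go func(i int, chunk []byte) {
			defer wg.Done()
			sum := sha256.Sum256(chunk) // hashed independently of other chunks
			digests[i] = fmt.Sprintf("sha256:%x", sum)
			errs[i] = uploadBlob(digests[i], chunk)
		}(i, data[start:end])
	}
	wg.Wait()

	for _, err := range errs {
		if err != nil {
			return nil, err
		}
	}
	return digests, nil
}

func main() {
	digests, _ := pushChunks(make([]byte, 10<<20), 4<<20) // 10 MiB in 4 MiB chunks
	fmt.Println(digests)
}
```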
Considerations/Issues
- Older clients and registries will not support this, so we have to find some way of being backwards compatible.
- Since there is no full reassembly on the registry, the full artifact is only available once assembly is done on the client side (as opposed to S3, where reassembly happens on the S3 side); a sketch of client-side assembly follows below.
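For that second point, client-side assembly might look something like this, assuming a `Chunk` type mirroring the index entries and a hypothetical `fetchBlob` helper wrapping a normal OCI blob GET:

```go
package client

import (
	"crypto/sha256"
	"fmt"
	"os"
)

// Chunk mirrors one entry of the chunk index (names are illustrative).
type Chunk struct {
	Digest string
	Size   int64
	Offset int64
}

// fetchBlob is a hypothetical helper wrapping a normal OCI blob GET.
func fetchBlob(digest string) ([]byte, error) {
	// ... HTTP calls to the registry elided ...
	return nil, nil
}

// assemble reconstructs the artifact on the client: each chunk is fetched
// (this loop could itself run in parallel), verified against its own
// digest, and written at its recorded offset.
func assemble(f *os.File, chunks []Chunk) error {
	for _, c := range chunks {
		data, err := fetchBlob(c.Digest)
		if err != nil {
			return err
		}
		sum := sha256.Sum256(data)
		if got := fmt.Sprintf("sha256:%x", sum); got != c.Digest {
			return fmt.Errorf("chunk at offset %d: digest mismatch", c.Offset)
		}
		if _, err := f.WriteAt(data, c.Offset); err != nil {
			return err
		}
	}
	return nil
}
```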