Replace compressed tar with format for random access #936

@abitrolly

I am rewriting code for go-containers to allow streaming access to container images, and I'd like to know whether it is too late to add parsing of streamed container images to https://github.com/opencontainers/image-spec/blob/main/considerations.md.

The particular problem with parsing an image stream is the tar format, which has no central index: a compressed archive must be decompressed and read from start to end just to list its contents. Quoting https://en.wikipedia.org/wiki/Tar_(computing)#Random_access:

The tar format was designed without a centralized index or table of content for files and their properties for streaming to tape backup devices. The archive must be read sequentially to list or extract files. For large tar archives, this causes a performance penalty, making tar archives unsuitable for situations that often require random access to individual files.

This means that if manifest.json is located at the end of the stream, the whole image needs to be downloaded (often into memory) and decompressed. Analyzing an image requires several passes to reconstruct the final usable state. It is impossible to list files without downloading and decompressing whole layers, which wastes cloud resources.
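To make the cost concrete, here is a minimal sketch in Python (chosen only for brevity, not taken from go-containers). It builds a small gzip-compressed tar in memory with manifest.json as the last entry, then reads it in streaming mode and counts how many compressed bytes had to be consumed before manifest.json was found. The helper names (`CountingReader`, `build_image`, `bytes_needed_to_find`), the `layer.tar` entry, and its random payload are all hypothetical and only illustrate the layout described above; this is not a real OCI image.

```python
import io
import os
import tarfile


class CountingReader:
    """Hypothetical helper: serves bytes from memory and counts what is read."""

    def __init__(self, data):
        self._buf = io.BytesIO(data)
        self.bytes_read = 0

    def read(self, size=-1):
        chunk = self._buf.read(size)
        self.bytes_read += len(chunk)
        return chunk


def build_image(payload_size=1 << 20):
    """Build a gzip-compressed tar whose last entry is manifest.json."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        def add(name, data):
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))

        # Random bytes stand in for layer data that does not compress well.
        add("layer.tar", os.urandom(payload_size))
        add("manifest.json", b'{"layers": ["layer.tar"]}')
    return buf.getvalue()


def bytes_needed_to_find(archive, wanted):
    """Stream the archive and return the compressed bytes read to reach `wanted`."""
    reader = CountingReader(archive)
    # "r|gz" is tarfile's non-seekable streaming mode: entries can only be
    # visited in order, and skipping an entry still decompresses its data.
    with tarfile.open(fileobj=reader, mode="r|gz") as tar:
        for member in tar:
            if member.name == wanted:
                return reader.bytes_read
    raise FileNotFoundError(wanted)


if __name__ == "__main__":
    archive = build_image()
    needed = bytes_needed_to_find(archive, "manifest.json")
    print(f"archive size: {len(archive)} bytes")
    print(f"read before manifest.json was found: {needed} bytes")
```

Under these assumptions, nearly the whole archive is consumed before the manifest is seen. If manifest.json were written first instead, the same loop could stop after the first read buffer.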

Adding a requirement that an image can be parsed in one pass, without necessarily downloading its whole contents, would help define a "well-formed OCI image" format that registries could then enforce on push. I've seen that the specification mentions zip and probably allows alternative compression methods, but I don't believe many would implement them if tar continues to be the de facto standard.

Because the spec has reached version 1.0, I would like to know whether changes to it are still acceptable.
