feat(cache): separate python installation from base image by adding pre-built remote cache #1360

Kaiyang-Chen · 2022-12-30T23:47:16Z

Description

In the current LLB compilation (shown in the figure below), we pull the base (custom) image as the first layer. This means that if the user changes the base image (a different CUDA version, a different OS, etc.), all caches from previous builds will miss.

Under my network conditions, creating the user group and installing Python with conda took around 1 minute. I think this step can be sped up by using a pre-built remote cache for each Python version, built from a fixed image. As demonstrated in the figure below, whenever the user changes the base image, we can simply pull the cached llb.Diff(fixStage, pythonStage) result and llb.Merge() it with the new base image.
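To make the idea concrete, here is a toy Python model of the diff-then-merge step. It is not BuildKit code: filesystem snapshots are modelled as plain dicts from path to content, `diff` and `merge` are hypothetical helpers that only roughly mimic llb.Diff() and llb.Merge() (deletions are ignored, and later layers are assumed to win on overlapping paths), and all file contents are made up.

```python
# Toy model: a filesystem snapshot is a dict mapping path -> file content.
# diff() and merge() are illustrative stand-ins, not the BuildKit API.

def diff(lower, upper):
    """Return the files that were added or changed going from lower to upper."""
    return {path: data for path, data in upper.items() if lower.get(path) != data}


def merge(*layers):
    """Apply layers in order; later layers win on overlapping paths."""
    result = {}
    for layer in layers:
        result.update(layer)
    return result


# Fixed image used only to pre-build the Python layer (hypothetical contents).
fix_stage = {"/bin/sh": "sh"}
python_stage = merge(fix_stage, {"/opt/conda/bin/python": "python3.9"})

# Computed once; this is what would be pushed to the remote cache.
python_layer = diff(fix_stage, python_stage)

# The user switches base images; the same cached layer is reused for both.
cuda_base = {"/bin/sh": "sh", "/usr/local/cuda/version.txt": "11.6"}
ubuntu_base = {"/bin/sh": "sh", "/etc/os-release": "ubuntu"}

cuda_env = merge(cuda_base, python_layer)
ubuntu_env = merge(ubuntu_base, python_layer)
```

The point of the sketch is that `python_layer` depends only on `fix_stage`, so changing the base image does not invalidate it.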

Potential problems

I'm not sure whether BuildKit supports exporting the llb.Diff() layer. If it does not, we can work around this by caching pythonStage and performing the llb.Diff() manually.
The method above modifies etc/passwd & etc/usergroup when creating the user group. When merging with the base image, conflicts in these files between different OSes might cause problems.
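The second problem can be illustrated with the same kind of toy model. This is only a sketch under assumptions: snapshots are dicts from path to content, `conflicting_paths` is a hypothetical helper, the passwd entries are invented, and the merge is assumed to let the later layer win on overlapping paths.

```python
# Toy model: detect files that both the base image and the cached layer define.
# conflicting_paths() is a hypothetical helper, not part of BuildKit.

def conflicting_paths(base, layer):
    """Return paths present in both snapshots with different content."""
    return sorted(p for p in layer if p in base and base[p] != layer[p])


# Hypothetical contents: the base image already ships its own user entry.
base = {
    "/etc/passwd": "root:x:0:0\nubuntu:x:1000:1000\n",
    "/bin/sh": "sh",
}

# The pre-built layer wrote /etc/passwd while creating the envd user.
python_layer = {
    "/etc/passwd": "root:x:0:0\nenvd:x:1000:1000\n",
    "/opt/conda/bin/python": "python3.9",
}

conflicts = conflicting_paths(base, python_layer)

# A naive merge where the later layer wins silently drops the base's user.
merged = {**base, **python_layer}
```

Here `conflicts` contains only `/etc/passwd`, and the merged file no longer has the base image's `ubuntu` entry, which is the kind of breakage the item above is worried about.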

Other thoughts

If exporting the llb.Diff() layer is possible, we might be able to pre-build caches for large packages such as PyTorch and CUDA-related components, and use them as plug-ins for the base image. Since package downloading takes a significant share of the time when building a Docker environment, this should speed up the build process a lot.

Message from the maintainers:

Love this enhancement proposal? Give it a 👍. We prioritise the proposals with the most 👍.


VoVAllen · 2023-01-03T02:57:22Z

Thanks for your contribution! I think the core problem here is on the BuildKit side: how we can inspect the llb.Diff node, and whether it's possible to export it separately. Could you raise the question in the BuildKit repo and link it here as well? Thanks!

kemingy · 2023-01-03T03:24:11Z

LLB Merge could be problematic when there are overlapping directories.
Maintaining a remote cache for different Python versions also needs to take security updates into account.

You need to check the v1 graph. It should support Python with or without Conda/Mamba.

gaocegege · 2023-01-03T08:20:20Z

Thanks for the proposal!

We can optimize the workflow further. For example, we can investigate whether we could merge the PyTorch/TensorFlow packages into the environment image directly, instead of downloading and installing them from PyPI.

The tf/torch packages are very large, so it may be faster to keep a remote cache for them.

VoVAllen · 2023-01-03T09:19:30Z

The same goes for the starship package. It is hosted on a GitHub domain, which makes it hard to install when there are network issues and we don't have a cache.

gaocegege · 2023-01-03T09:24:25Z

Yep. starship. It is hard to install here in CN.

Kaiyang-Chen added the type/feature 💡 label Dec 30, 2022

tensorchord bot added type/documentation 📄 type/enhancement 💭 area/docker 🐳 area/buildkit 🚢 labels Dec 30, 2022

Kaiyang-Chen changed the title ~~feat: separate python installation from base image by adding pre-built remote cache~~ feat(cache): separate python installation from base image by adding pre-built remote cache Dec 30, 2022
