Add support for locking kernels #10
This change allows Python projects that use kernels to lock the kernel revisions on a project-basis. For this to work, the user only has to include `hf-kernels` as a build dependency. During the build, a lock file is written to the package's pkg-info. During runtime we can read it out and use the corresponding revision. When the kernel is not locked, the revision that is provided as an argument is used.
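The runtime half of this description can be sketched roughly as follows. This is only an illustration: the lock file name inside the package metadata, the JSON layout (a list of objects with `repo_id` and `sha` keys), and the helper names `read_lock_text` and `resolve_revision` are assumptions, not the actual `hf-kernels` implementation.

```python
import json
from importlib import metadata
from typing import Optional

# Assumed file name inside the package metadata; the real name may differ.
LOCK_FILENAME = "hf-kernels.lock"


def read_lock_text(dist_name: str) -> Optional[str]:
    """Return the lock file stored in a distribution's metadata, or None."""
    try:
        dist = metadata.distribution(dist_name)
    except metadata.PackageNotFoundError:
        return None
    # Distribution.read_text returns None when the file is absent.
    return dist.read_text(LOCK_FILENAME)


def resolve_revision(lock_json: Optional[str], repo_id: str, revision: str) -> str:
    """Prefer the locked sha for repo_id; otherwise fall back to `revision`."""
    if lock_json is None:
        return revision
    # Hypothetical layout: [{"repo_id": "...", "sha": "..."}, ...]
    for entry in json.loads(lock_json):
        if entry["repo_id"] == repo_id:
            return entry["sha"]
    return revision
```

With this shape, a project without a lock file behaves exactly as before: the revision passed as an argument is used unchanged.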
c5266da to 37be1e9
    if locked_sha is None:
        raise ValueError(f"Kernel `{repo_id}` is not locked")

    package_name, package_path = install_kernel(
`load_kernel` is supposed to skip the download entirely. This reintroduces it.
`hf_hub_download(..., local_files_only=True)` is the only public API I found to IGNORE downloading.
If I don't have the model cached and use `load_kernel`, I get the following error:
% python -c 'import kernel_test; kernel_test.run()'
[...]
FileNotFoundError: [Errno 2] No such file or directory: '/scratch/daniel/.cache/huggingface/hub/models--kernels-community--activation/snapshots/a71853ecbdd899526f9810cc558ee24081a6302e/build/torch25-cxx98-cu124-x86_64-linux/activation/__init__.py'
The error could be better, but it doesn't seem to download?
I did forget to pass `local_files_only` through to `get_metadata`. Pushing a fix for that now. Then it fails even earlier:
% python -c 'import kernel_test; kernel_test.run()'
[...]
huggingface_hub.errors.LocalEntryNotFoundError: Cannot find the requested files in the disk cache and outgoing traffic has been disabled. To enable hf.co look-ups and downloads online, set 'local_files_only' to False.
What I mean is: let's keep the actual `local_files_only` code separate. Otherwise it's super easy to screw up and reintroduce the internet connection. (Better yet if we could sidestep that bad API altogether.)
Ok, made separate again.
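One way to keep the offline path from ever touching the network is to resolve the cached files with plain path operations. The sketch below assumes the cache layout visible in the traceback above (`models--<org>--<name>/snapshots/<sha>/build/<variant>/<package>/__init__.py`); the function name `find_cached_kernel` and its parameters are hypothetical and not part of `hf-kernels` or `huggingface_hub`.

```python
from pathlib import Path


def find_cached_kernel(
    cache_dir: Path, repo_id: str, sha: str, variant: str, package_name: str
) -> Path:
    """Return the cached package directory without any network access.

    Raises FileNotFoundError with an actionable message when the kernel
    has not been downloaded yet.
    """
    # "kernels-community/activation" -> "models--kernels-community--activation"
    repo_dir = "models--" + repo_id.replace("/", "--")
    package_dir = (
        cache_dir / repo_dir / "snapshots" / sha / "build" / variant / package_name
    )
    if not (package_dir / "__init__.py").is_file():
        raise FileNotFoundError(
            f"Kernel `{repo_id}` at revision {sha} is not in the local cache "
            f"({package_dir}); download it first."
        )
    return package_dir
```

Because this function only reads the filesystem, a later refactor cannot accidentally reintroduce a download on the load path. The trade-off is that it depends on the cache layout, which is an implementation detail of the Hub client.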

    file_locks = []
    for sibling in r.siblings:
        if sibling.rfilename.startswith("build/torch"):
Shouldn't we filter by version here too? (Maybe in a subsequent PR?)
The lockfile should contain all build variants, because we don't know which Torch/CUDA version a downstream user will have.
Or did you mean the kernel version? If so, a given commit is only supposed to have one version.
Yes, I meant kernel version/version range.
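To illustrate why every build variant ends up in the lock, here is a minimal sketch that groups a repository file listing by variant directory, using the same `build/torch` prefix test as the diff. The function name `variants_from_filenames` and the plain-list input are assumptions made for this example; the real code iterates over the repository's sibling entries.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def variants_from_filenames(filenames: Iterable[str]) -> Dict[str, List[str]]:
    """Group files under build/torch* by their variant directory."""
    variants: Dict[str, List[str]] = defaultdict(list)
    for name in filenames:
        if not name.startswith("build/torch"):
            continue  # README, sources, and other files are not locked
        parts = name.split("/")
        if len(parts) < 3:
            continue  # not a file inside a variant directory
        # parts[1] is e.g. "torch25-cxx98-cu124-x86_64-linux"
        variants[parts[1]].append(name)
    return dict(variants)
```

At load time, a downstream user would pick the single variant matching their own Torch/CUDA setup from this mapping.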
        return

    lock_path = cwd / "hf-kernels.lock"
    if not lock_path.exists():
Isn't that always true?
How can the lockfile exist before it's created?
I think it could happen in an editable install:
- Create the lockfile.
- Do an editable install. (the lockfile gets written into the package info)
- Remove the lockfile.
- Do an editable install.
Though I still need to check whether this is the case.
Forgot to add: this checks for the lock file in the project's source directory. The file should exist there before running the install if you want to lock the versions.
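The build-time check discussed in this thread could look roughly like the sketch below. The helper name `copy_lock_to_metadata` and the idea of copying into a metadata directory passed in by the caller are assumptions for illustration; only the `hf-kernels.lock` file name and the existence check come from the diff.

```python
import shutil
from pathlib import Path

LOCK_NAME = "hf-kernels.lock"


def copy_lock_to_metadata(cwd: Path, metadata_dir: Path) -> bool:
    """Copy the project's lock file into the package metadata directory.

    Returns False (and copies nothing) when the source tree has no lock
    file, so projects that do not lock kernels still build normally.
    """
    lock_path = cwd / LOCK_NAME
    if not lock_path.exists():
        return False
    metadata_dir.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(lock_path, metadata_dir / LOCK_NAME)
    return True
```

Note that this sketch does not remove a stale copy from a previous build, which is the editable-install scenario raised above.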
- `hf-kernels lock .` locks the kernels specified in a project's `pyproject.toml`. Building a package with the `setuptools` build backend, which is the default for `pyproject.toml`, will add the lock file to the package's metadata. The kernel can then be downloaded and loaded at the locked version using `get_locked_kernel`.
- `hf-kernels download .` downloads the locked kernels to the HF cache directory. The kernel can then be loaded using `load_kernel` (which is a small wrapper for `get_locked_kernel`).

PR uploaded as `hf-kernels` 0.1.1 to PyPI for testing.

In the future, we want to be able to specify versions in `pyproject.toml` (which then get locked), but that's for a later PR.