feat(generic): add authentication support for artifact downloads#1525
feat(generic): add authentication support for artifact downloads#1525ST3V1K wants to merge 4 commits intohermetoproject:mainfrom
Conversation
There was a problem hiding this comment.
Code Review
This pull request implements per-artifact authentication for the generic fetcher, introducing lockfile version 2.0 with support for Bearer and HTTP Basic auth. The implementation includes environment variable interpolation for secrets and refactors the download utility to handle per-URL headers. Feedback suggests enhancing the environment variable regex to support lowercase names, improving validation logic to prevent misleading error messages when extra fields are present, and simplifying the header generation method.
| @model_validator(mode="before") | ||
| @classmethod | ||
| def _check_mutually_exclusive(cls, values: dict) -> dict: | ||
| if sum(1 for v in values.values() if v is not None) != 1: |
There was a problem hiding this comment.
The mutually exclusive check iterates over all keys in the input dictionary. If a user provides an extra field (which is forbidden by the model configuration), this validator might raise a misleading error message about multiple auth types instead of the specific error about the extra field. It is better to explicitly check only the known authentication fields.
| if sum(1 for v in values.values() if v is not None) != 1: | |
| if sum(1 for k in ("basic", "bearer") if values.get(k) is not None) != 1: |
d0557fa to
73ffa42
Compare
| return os.environ[var_name] | ||
| raise ValueError(f"Environment variable {var_name} is not set") | ||
|
|
||
| return re.sub(ENV_VAR_PATTERN, get_env_var, value).replace("\\$", "$") |
There was a problem hiding this comment.
There are two built-in alternatives to implementing this manually, either os.path.expandvars(<input_str>) for rather straightforward substitution (does not raise when a key is missing), or a more involved string.Template.substitute(os.environ) which raises KeyError.
There was a problem hiding this comment.
With os.path.expandvars(<input_str>) (or string.Template.safe_substitute) there is the issue with them not throwing any error on missing vars -
hermeto/docs/design/generic-auth.md
Lines 179 to 181 in cc532aa
And with string.Template.substitute, the issue is that there is no way to have $ directly in the string. I doubt that anyone would need that. But if, for some reason, anyone would need that, there would be no way to do it as far as I know.
With my original design, you can escape it \$.
There was a problem hiding this comment.
However, I did not know about these functions beforehand, and they seem useful. Thank you
There was a problem hiding this comment.
That's a good point, however is there an indication that someone would actually need to embed a $ into any of the fields?
There was a problem hiding this comment.
No, there is not. I'll replace it with string.Template.substitute.
| @model_validator(mode="before") | ||
| @classmethod | ||
| def _check_mutually_exclusive(cls, values: dict) -> dict: | ||
| if sum(1 for v in values.values() if v is not None) != 1: |
There was a problem hiding this comment.
There are just two options here, values["basic"] and values["bearer"], WDYT about stating what you mean directly:
# direct check:
if (values["basic"] is not None and values["bearer"] is not None) or
(values["basic"] is None and values["bearer"] is None):
raise ValueError("Exactly one of the auth types must be set")
# less direct check:
supplied_auth_types = [values["basic"], values["bearer"]
if all(supplied_auth_types) or not any(supplied_auth_types):
raise ValueError("Exactly one of the auth types must be set")sum is a very nice tool and could be used for pretty much anything, but sometimes it just requires some extra effort from a reader to process, it is so much better to have an easy to follow code in the long term!
There was a problem hiding this comment.
I tried to design it with an arbitrary number of auth types in mind. If anyone in the future wanted to add a new type, they could do that without modifying these model validators.
If there were only two types, I agree that this would be better.
There was a problem hiding this comment.
I would have either kept it simple or wrapped sum(... in a helper function with an appropriate name, something like precisely_one_not_none(values.values()).
There was a problem hiding this comment.
Okay, I kept it simple.
| return | ||
|
|
||
| # NOTE: when present proxy auth is the same for all packages accessible | ||
| # through a proxy. |
There was a problem hiding this comment.
This note rationalizes the next(iter(...)) construct, please move it along.
There was a problem hiding this comment.
Since more_itertools is not yet present before merging #1408, the method first is therefore unavailable.
At this time, I don't really know how I should address this.
There was a problem hiding this comment.
Sorry, my comment was not clear enough. Please move the comment along with the line it comments on, i.e. please place it inside def headers(...) right above auth = next(iter(...))[... line
There was a problem hiding this comment.
My bad, I have that PR in mind all the time for some reason.
| return os.environ[var_name] | ||
| raise ValueError(f"Environment variable {var_name} is not set") | ||
|
|
||
| return re.sub(ENV_VAR_PATTERN, get_env_var, value).replace("\\$", "$") |
There was a problem hiding this comment.
That's a good point, however is there an indication that someone would actually need to embed a $ into any of the fields?
| @model_validator(mode="before") | ||
| @classmethod | ||
| def _check_mutually_exclusive(cls, values: dict) -> dict: | ||
| if sum(1 for v in values.values() if v is not None) != 1: |
There was a problem hiding this comment.
I would have either kept it simple or wrapped sum(... in a helper function with an appropriate name, something like precisely_one_not_none(values.values()).
| ) | ||
|
|
||
| version = None | ||
| if isinstance(lockfile_data, dict): |
There was a problem hiding this comment.
This is essentially pre-validation of data, I would have considered trying to let pydantic do it. WDUT about adding some grouping to the models? I ran a quick test and got this:
>>> GLF = Annotated[Union[GenericLockfileV1, GenericLockfileV2], Field(Discriminator='metadata')]
>>> TypeAdapter(GLF).validate_python({'metadata': {'version': "1.0"}, "artifacts": []})
GenericLockfileV1(metadata=LockfileMetadata(version='1.0'), artifacts=[])
>>> TypeAdapter(GLF).validate_python({'metadata': {'version': "2.0"}, "artifacts": []})
GenericLockfileV2(metadata=LockfileMetadataV2(version='2.0'), artifacts=[])
>>> TypeAdapter(GLF).validate_python({'metadata': {'version': "3.0"}, "artifacts": []})
Traceback (most recent call last):
...
pydantic_core._pydantic_core.ValidationError: ...
>>> TypeAdapter(GLF).validate_python("")
Traceback (most recent call last):
...
pydantic_core._pydantic_core.ValidationError: ...There was a problem hiding this comment.
I had to do it a little differently, but I got it to work. I was getting some error, because metadata was not a Literal
I never really used Pydantic. I should learn it a bit more. Thank you for the suggestions.
| return | ||
|
|
||
| # NOTE: when present proxy auth is the same for all packages accessible | ||
| # through a proxy. |
There was a problem hiding this comment.
Sorry, my comment was not clear enough. Please move the comment along with the line it comments on, i.e. please place it inside def headers(...) right above auth = next(iter(...))[... line
Support basic and bearer authentication in generic lockfiles, with per-URL headers and environment variable expansion in credentials. Refactor async_download_files to use per-URL headers instead of a single shared auth parameter. Closes: hermetoproject#1224 Design: hermetoproject#1247 Assisted-by: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Jakub Nekvinda <jnekvind@redhat.com>
With the new generic authentication support we needed a way to test Bearer tokens. Adding a new nginx server with a static Bearer token should be enough for this. Signed-off-by: Jakub Nekvinda <jnekvind@redhat.com>
Add tests for BasicAuth, BearerAuth, and LockfileArtifactAuth models, lockfile parsing with auth sections, and integration tests for basic auth, bearer auth, and wrong credentials scenarios. Assisted-by: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Jakub Nekvinda <jnekvind@redhat.com>
Signed-off-by: Jakub Nekvinda <jnekvind@redhat.com>
The generic fetcher does not support authentication, making it impossible to download artifacts from private registries (e.g. private GitLab instances).
This PR adds per-artifact authentication to the generic fetcher:
authblock inartifacts.lock.yaml, with environment variable interpolation ($VAR/${VAR}) for secretsA follow-up PRs will be submitted to https://github.com/hermetoproject/integration-tests.
Resolves: #1224
Design: #1247