feat(generic): add authentication support for artifact downloads#1525

Open
ST3V1K wants to merge 4 commits into hermetoproject:main from ST3V1K:feat/generic-auth

Collaborator

@ST3V1K ST3V1K commented Apr 20, 2026

The generic fetcher does not support authentication, making it impossible to download artifacts from private registries (e.g. private GitLab instances).

This PR adds per-artifact authentication to the generic fetcher:

  • HTTP Bearer and Basic auth via an auth block in artifacts.lock.yaml, with environment variable interpolation ($VAR / ${VAR}) for secrets
  • Lockfile version 2.0 schema to carry auth configuration; version 1.0 remains supported and rejects auth fields with a clear error
  • Per-URL header injection into aiohttp downloads, allowing mixed authenticated and unauthenticated artifacts in the same lockfile
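As a rough illustration of the first and third bullets, the sketch below builds Bearer and Basic `Authorization` headers and keeps them in a per-URL mapping. It uses only the standard library; the function names, URLs, and credentials are hypothetical and are not taken from the PR's code or lockfile schema.

```python
import base64


def bearer_header(token: str) -> dict[str, str]:
    """Build an Authorization header for HTTP Bearer auth."""
    return {"Authorization": f"Bearer {token}"}


def basic_header(username: str, password: str) -> dict[str, str]:
    """Build an Authorization header for HTTP Basic auth."""
    raw = f"{username}:{password}".encode("utf-8")
    return {"Authorization": "Basic " + base64.b64encode(raw).decode("ascii")}


# Hypothetical per-URL mapping: authenticated and anonymous artifacts
# can sit side by side because each URL carries its own headers.
headers_by_url = {
    "https://private.example.com/a.tar.gz": bearer_header("s3cr3t"),
    "https://private.example.com/b.zip": basic_header("user", "pass"),
    "https://public.example.com/c.whl": {},
}
```

A downloader could then look up `headers_by_url[url]` for each request instead of sharing one auth parameter across all downloads.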

A follow-up PR will be submitted to https://github.com/hermetoproject/integration-tests.

Resolves: #1224
Design: #1247

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request implements per-artifact authentication for the generic fetcher, introducing lockfile version 2.0 with support for Bearer and HTTP Basic auth. The implementation includes environment variable interpolation for secrets and refactors the download utility to handle per-URL headers. Feedback suggests enhancing the environment variable regex to support lowercase names, improving validation logic to prevent misleading error messages when extra fields are present, and simplifying the header generation method.

Comment thread hermeto/core/package_managers/generic/models.py Outdated
@model_validator(mode="before")
@classmethod
def _check_mutually_exclusive(cls, values: dict) -> dict:
    if sum(1 for v in values.values() if v is not None) != 1:

medium

The mutually exclusive check iterates over all keys in the input dictionary. If a user provides an extra field (which is forbidden by the model configuration), this validator might raise a misleading error message about multiple auth types instead of the specific error about the extra field. It is better to explicitly check only the known authentication fields.

Suggested change
- if sum(1 for v in values.values() if v is not None) != 1:
+ if sum(1 for k in ("basic", "bearer") if values.get(k) is not None) != 1:
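
To make the difference concrete, here is a small stand-alone sketch comparing the two counting strategies on an input that carries one valid auth type plus an unknown key. The helper names and the payload (including the inner `token` key) are hypothetical; only the two `sum(...)` expressions come from the review.

```python
def count_all(values: dict) -> int:
    # Original check: counts every non-None value, unknown keys included.
    return sum(1 for v in values.values() if v is not None)


def count_known(values: dict, known: tuple = ("basic", "bearer")) -> int:
    # Suggested check: counts only the known authentication fields.
    return sum(1 for k in known if values.get(k) is not None)


# One real auth type plus a misspelled or unsupported extra field.
payload = {"bearer": {"token": "$TOKEN"}, "typo_field": True}

all_count = count_all(payload)      # 2, so "exactly one" fails
known_count = count_known(payload)  # 1, so the check passes
```

With the second variant the mutual-exclusivity check passes, which leaves the extra key to be reported by whatever rejects unknown fields.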

Comment thread hermeto/core/package_managers/generic/models.py Outdated
@ST3V1K ST3V1K force-pushed the feat/generic-auth branch 2 times, most recently from d0557fa to 73ffa42 Compare April 28, 2026 07:03
Comment thread hermeto/core/package_managers/generic/main.py Outdated
Comment thread hermeto/core/package_managers/generic/models.py
return os.environ[var_name]
raise ValueError(f"Environment variable {var_name} is not set")

return re.sub(ENV_VAR_PATTERN, get_env_var, value).replace("\\$", "$")
Contributor

There are two built-in alternatives to implementing this manually, either os.path.expandvars(<input_str>) for rather straightforward substitution (does not raise when a key is missing), or a more involved string.Template.substitute(os.environ) which raises KeyError.
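
The behavioral difference on a missing variable can be checked with a short standard-library sketch. The variable name `HERMETO_DEMO_TOKEN` is made up for the demonstration.

```python
import os
from string import Template

# Hypothetical variable name, removed first so it is guaranteed to be unset.
os.environ.pop("HERMETO_DEMO_TOKEN", None)
value = "Bearer $HERMETO_DEMO_TOKEN"

# expandvars leaves unknown references untouched and raises nothing.
expanded = os.path.expandvars(value)

# safe_substitute also leaves missing placeholders in place.
safe = Template(value).safe_substitute(os.environ)

# substitute raises KeyError carrying the missing variable name.
try:
    Template(value).substitute(os.environ)
    missing = None
except KeyError as exc:
    missing = exc.args[0]

# Once the variable is set, substitute resolves it.
os.environ["HERMETO_DEMO_TOKEN"] = "abc123"
resolved = Template(value).substitute(os.environ)
```

Only `substitute` surfaces the unset variable as an error, which matters when a silent pass-through would send a literal `$VAR` string as a credential.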

Collaborator Author

With os.path.expandvars(<input_str>) (or string.Template.safe_substitute), the problem is that neither raises an error on missing variables, while the documented behavior is:

> required by a given platform REST API spec (e.g. [Gitea][gitea-auth]). Hermeto
> fails with a clear error message if any of the referenced environment variables
> is unset.

And with string.Template.substitute, the issue is that there is no way to have a literal $ in the string. I doubt that anyone would need that, but if someone did, there would be no way to do it as far as I know.
With my original design, you can escape it as \$.
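
For reference, Python's `string.Template` does document an escape for a literal dollar sign: `$$`, rather than the `\$` used in the original design. The sketch below shows that escape and what happens to an unescaped stray `$`. The mapping `env` stands in for `os.environ`, and the sample strings are invented.

```python
from string import Template

env = {"TOKEN": "abc123"}  # stand-in for os.environ

# "$$" is Template's escape for a single literal "$".
literal = Template("pa$$word-$TOKEN").substitute(env)

# A lone "$" that does not start a valid placeholder is rejected.
try:
    Template("cost: 5$ ($TOKEN)").substitute(env)
    error = None
except ValueError as exc:
    error = str(exc)
```

So a credential containing `$` would need to be written with `$$` in the lockfile if `substitute` were used as-is.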

Collaborator Author

However, I did not know about these functions beforehand, and they seem useful. Thank you.

Contributor

That's a good point; however, is there any indication that someone would actually need to embed a $ in any of the fields?

Collaborator Author

No, there is not. I'll replace it with string.Template.substitute.

@model_validator(mode="before")
@classmethod
def _check_mutually_exclusive(cls, values: dict) -> dict:
    if sum(1 for v in values.values() if v is not None) != 1:
Contributor

There are just two options here, values["basic"] and values["bearer"], WDYT about stating what you mean directly:

# direct check:
if (values["basic"] is not None and values["bearer"] is not None) or (
    values["basic"] is None and values["bearer"] is None
):
    raise ValueError("Exactly one of the auth types must be set")
# less direct check:
supplied_auth_types = [values["basic"], values["bearer"]]
if all(supplied_auth_types) or not any(supplied_auth_types):
    raise ValueError("Exactly one of the auth types must be set")

sum is a very nice tool and can be used for pretty much anything, but sometimes it requires extra effort from a reader to process. Easy-to-follow code is so much better in the long term!

Collaborator Author

I tried to design it with an arbitrary number of auth types in mind. If anyone in the future wanted to add a new type, they could do that without modifying these model validators.

If there were only two types, I agree that this would be better.

Contributor

I would have either kept it simple or wrapped sum(... in a helper function with an appropriate name, something like precisely_one_not_none(values.values()).
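
A minimal sketch of what such a helper might look like, using the name proposed above. This is an illustration only and not code from the PR.

```python
from collections.abc import Iterable
from typing import Any


def precisely_one_not_none(items: Iterable[Any]) -> bool:
    """Return True when exactly one item is not None."""
    return sum(item is not None for item in items) == 1


one_set = precisely_one_not_none([None, {"token": "x"}])
none_set = precisely_one_not_none([None, None])
both_set = precisely_one_not_none([{"user": "u"}, {"token": "x"}])
```

One design point to keep in mind: this helper tests `is not None`, while an `all(...)`/`any(...)` check tests truthiness, so the two disagree on values such as an empty dict or an empty string.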

Collaborator Author

Okay, I kept it simple.

return

# NOTE: when present proxy auth is the same for all packages accessible
# through a proxy.
Contributor

This note rationalizes the next(iter(...)) construct, please move it along.

Collaborator Author

Since more_itertools will not be available until #1408 is merged, its first function cannot be used here.
At this time, I don't really know how I should address this.

Contributor

Sorry, my comment was not clear enough. Please move the comment along with the line it comments on, i.e., place it inside def headers(...), right above the auth = next(iter(...))[... line.
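
For readers unfamiliar with the construct, the sketch below shows the `next(iter(...))` idiom with the note placed directly above the line it explains. The mapping and its contents are hypothetical and do not reproduce the PR's `headers` method.

```python
# Hypothetical data: proxy auth keyed by package URL, identical for every entry.
proxy_auth_by_url = {
    "https://proxy.example.com/pkg-a": ("user", "pass"),
    "https://proxy.example.com/pkg-b": ("user", "pass"),
}

# NOTE: when present proxy auth is the same for all packages accessible
# through a proxy, so taking the first value is enough.
auth = next(iter(proxy_auth_by_url.values()))

# With a default, an empty mapping yields None instead of raising StopIteration.
no_auth = next(iter({}.values()), None)
```

This is the standard-library equivalent of taking the first element of an iterable without any extra dependency.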

Collaborator Author

My bad, I have that PR in mind all the time for some reason.

@ST3V1K ST3V1K force-pushed the feat/generic-auth branch from 73ffa42 to 1f371d4 Compare May 4, 2026 11:15

)

version = None
if isinstance(lockfile_data, dict):
Contributor

This is essentially pre-validation of data; I would have considered letting pydantic do it. WDYT about adding some grouping to the models? I ran a quick test and got this:

>>> GLF = Annotated[Union[GenericLockfileV1, GenericLockfileV2], Field(Discriminator='metadata')]
>>> TypeAdapter(GLF).validate_python({'metadata': {'version': "1.0"}, "artifacts": []})
GenericLockfileV1(metadata=LockfileMetadata(version='1.0'), artifacts=[])
>>> TypeAdapter(GLF).validate_python({'metadata': {'version': "2.0"}, "artifacts": []})
GenericLockfileV2(metadata=LockfileMetadataV2(version='2.0'), artifacts=[])
>>> TypeAdapter(GLF).validate_python({'metadata': {'version': "3.0"}, "artifacts": []})
Traceback (most recent call last):
    ...
pydantic_core._pydantic_core.ValidationError: ...
>>> TypeAdapter(GLF).validate_python("")
Traceback (most recent call last):
    ...
pydantic_core._pydantic_core.ValidationError: ...

Collaborator Author

I had to do it a little differently, but I got it to work. I was getting an error because metadata was not a Literal.

I never really used Pydantic. I should learn it a bit more. Thank you for the suggestions.
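
The thread does not show the final version, so the following is only one possible way to dispatch on a nested `metadata.version` value: a callable discriminator with tagged union members. It assumes pydantic 2.5 or newer, and all model and function names here are hypothetical stand-ins rather than Hermeto's actual classes.

```python
from typing import Annotated, Any, Literal, Optional, Union

from pydantic import BaseModel, Discriminator, Tag, TypeAdapter, ValidationError


class MetadataV1(BaseModel):
    version: Literal["1.0"]


class MetadataV2(BaseModel):
    version: Literal["2.0"]


class LockfileV1(BaseModel):
    metadata: MetadataV1
    artifacts: list[dict]


class LockfileV2(BaseModel):
    metadata: MetadataV2
    artifacts: list[dict]


def lockfile_version(data: Any) -> Optional[str]:
    """Pick the union tag from the nested metadata.version of raw input."""
    if isinstance(data, dict):
        metadata = data.get("metadata")
        if isinstance(metadata, dict):
            version = metadata.get("version")
            if isinstance(version, str):
                return version
    return None  # no usable tag: pydantic reports a validation error


Lockfile = Annotated[
    Union[
        Annotated[LockfileV1, Tag("1.0")],
        Annotated[LockfileV2, Tag("2.0")],
    ],
    Discriminator(lockfile_version),
]

adapter = TypeAdapter(Lockfile)
v2 = adapter.validate_python({"metadata": {"version": "2.0"}, "artifacts": []})

try:
    adapter.validate_python({"metadata": {"version": "3.0"}, "artifacts": []})
    rejected = False
except ValidationError:
    rejected = True
```

The callable sidesteps the limitation mentioned above: a plain string discriminator needs a `Literal` field directly on each union member, whereas here the version lives one level down inside `metadata`.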


Support basic and bearer authentication in generic lockfiles, with
per-URL headers and environment variable expansion in credentials.
Refactor async_download_files to use per-URL headers instead of a
single shared auth parameter.

Closes: hermetoproject#1224
Design: hermetoproject#1247

Assisted-by: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Jakub Nekvinda <jnekvind@redhat.com>
@ST3V1K ST3V1K force-pushed the feat/generic-auth branch from 1f371d4 to 9564304 Compare May 7, 2026 07:18
ST3V1K added 3 commits May 7, 2026 12:40
With the new generic authentication support we needed a way to test
Bearer tokens. Adding a new nginx server with a static Bearer token
should be enough for this.

Signed-off-by: Jakub Nekvinda <jnekvind@redhat.com>

Add tests for BasicAuth, BearerAuth, and LockfileArtifactAuth models,
lockfile parsing with auth sections, and integration tests for basic
auth, bearer auth, and wrong credentials scenarios.

Assisted-by: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Jakub Nekvinda <jnekvind@redhat.com>
Signed-off-by: Jakub Nekvinda <jnekvind@redhat.com>
@ST3V1K ST3V1K force-pushed the feat/generic-auth branch from 9564304 to cc9390b Compare May 7, 2026 10:52
Development

Successfully merging this pull request may close these issues.

[Feature Request] Support authentication for generic backend downloads in prefetch-dependencies
