Skip to content

Implement PUT /api/v1/trusted_publishing/tokens API endpoint #11131

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 21 commits into
base: main
Choose a base branch
from

Conversation

Turbo87
Copy link
Member

@Turbo87 Turbo87 commented May 8, 2025

This is a big one... I'm sorry! 😅

This PR implements the token exchange API endpoint used for the Trusted Publishing flow. This is roughly how it works:

  • The endpoint accepts a JSON web token, issued by the GitHub Actions runner.
  • It looks at the iss claim (Issuer) to figure out who issued this token and compares it against a list of supported Trusted Publishing providers (configurable via TRUSTPUB_PROVIDERS, but currently only supporting GitHub Actions).
  • If a supported provider is found, it loads the JWKS (JSON Web Key Set) from their server, to validate that the JWT was indeed issued by that provider.
  • It uses the jti claim from the JWT to prevent token reuse by attempting to insert a new row into the trustpub_used_jtis table, or returns an error if a uniqueness constraint is triggered.
  • It then checks the trustpub_configs_github table for a configuration matching the other claims in the JWT
  • If any configurations are found a temporary access token is generated, saved to the trustpub_tokens table, and then returned as the API response.

As with the previous PRs, this implementation is largely inspired by the corresponding implementation in https://github.com/pypi/warehouse.

I've prepared the code to be somewhat provider-agnostic, but at this point the endpoint is still hardcoded to only support GitHub Actions. Refactoring this to completion can be done once we add support for another provider.

With that context I hope the PR is somewhat reviewable. It is definitely best reviewed commit by commit.

Related:

Turbo87 added 20 commits May 8, 2025 09:54
This fn can be used to decode a JSON web token without verifying it's signature or claims. Only the `iss` claim will actually be decoded, since we use that to find the correct decoding key for the JWT issuer.
This defaults to the domain name (crates.io / staging.crates.io) and controls the expected `aud` claim of the OIDC JWT in the Trusted Publishing token exchange.
@Turbo87 Turbo87 added C-enhancement ✨ Category: Adding new behavior or a change to the way an existing feature works A-backend ⚙️ labels May 8, 2025
@Turbo87 Turbo87 requested review from LawnGnome and a team May 8, 2025 07:58
pub struct AccessToken(SecretString);

impl AccessToken {
const PREFIX: &str = "cio_tp_";
Copy link
Member Author

@Turbo87 Turbo87 May 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for regular tokens we use cio plus 32 alphanumeric characters. when we implemented GitHub Secret Scanning we were told to use a more unique prefix with a separator. therefore the temporary access tokens are using the cio_to_ prefix to make them easier to recognize with less false positives.

Comment on lines +13 to +25
/// Generate a new access token.
pub fn generate() -> Self {
Self::from_u64s(rand::random(), rand::random())
}

/// Create an access token from two u64 values.
///
/// This is used internally by the `generate()` fn and is extracted
/// to a separate function for testing purposes.
fn from_u64s(r1: u64, r2: u64) -> Self {
let plaintext = format!("{}{r1:016x}{r2:016x}", Self::PREFIX);
Self(SecretString::from(plaintext))
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • is 128bits of randomness enough?

Yep, 128 bits of entropy is sufficient! You could also do 256 bits to match what PyPI does for its tokens, but that's IMO overkill.

suggests adding a (32bit) checksum at the end for validation purposes. That would require us to find and add yet another dependency. Not sure if that's really worth it? 🤔

I think it depends partially on expected load/how expensive you expect credential revocation to be -- if there's a mass revocation event where GitHub is sending you tends of thousands of potential credentials, it can be helpful to have a pre-DB filter that weeds out false positives.

(This of course doesn't prevent someone from spamming you with fake tokens, since a CRC32 or similar is easy to crunch. But it limits the spam computationally, i.e. requires the spammer to waste time on that and keeps them from having a multiplicative impact on the DB itself.)

TL;DR: if the DB roundtrip is or could become expensive for token revocation, then a checksum or check-digit sequence is a good idea. Otherwise, you could probably leave it out of the MVP and version it in later 🙂

(I haven't used it myself, but https://crates.io/crates/crc32fast might be a good candidate?)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the past, I've generally regretted not adding in the support for the checksum. 😓 That said, the tokens often looked like other hashes, so there were a lot of false positive submissions.

Since these have the unique prefix the number of submissions should end up restricted to things that actually look like tokens.

TL;DR: if the DB roundtrip is or could become expensive for token revocation, then a checksum or check-digit sequence is a good idea. Otherwise, you could probably leave it out of the MVP and version it in later 🙂

+1 to leaving this out for the MVP and only implementing if it becomes an issue later on.

Comment on lines +57 to +87
Self {
iss: GITHUB_ISSUER_URL.into(),
nbf: now,
iat: now,
exp: now + 30 * 60,
jti: "example-id".into(),
sub: format!("repo:{owner_name}/{repository_name}"),
aud: AUDIENCE.into(),

environment: environment.map(|s| s.into()),
r#ref: "refs/heads/main".into(),
sha: "example-sha".into(),
repository: format!("{owner_name}/{repository_name}"),
repository_owner: owner_name.into(),
actor_id: "12".into(),
repository_visibility: "private".into(),
repository_id: "74".into(),
repository_owner_id: owner_id.to_string(),
run_id: "example-run-id".into(),
run_number: "10".into(),
run_attempt: "2".into(),
runner_environment: "github-hosted".into(),
actor: "octocat".into(),
workflow: "example-workflow".into(),
head_ref: "".into(),
base_ref: "".into(),
event_name: "workflow_dispatch".into(),
ref_type: "branch".into(),
job_workflow_ref: format!(
"{owner_name}/{repository_name}/.github/workflows/{workflow_filename}@refs/heads/main"
),
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines +8 to +9
static WORKFLOW_REF_RE: LazyLock<regex::Regex> =
LazyLock::new(|| regex::Regex::new(r"([^/]+\.(yml|yaml))(@.+)").unwrap());
Copy link
Member Author

@Turbo87 Turbo87 May 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PyPI uses a "lookahead match" pattern, but that does not appear to be supported by the regex crate. I've slightly adjusted the pattern to make it work. Since all the test cases from PyPI are passing I'm fairly confident in the change.

Comment on lines +9 to +18
/// The main implementation of the [`OidcKeyStore`] trait.
///
/// This struct fetches OIDC keys from a remote provider and caches them. If
/// a key is not found in the cache, it will attempt to refresh the cached
/// key set, unless the cache has just recently been refreshed.
pub struct RealOidcKeyStore {
issuer_uri: String,
client: reqwest::Client,
cache: RwLock<Cache>,
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This implementation roughly matches how PyPI caches the JWKS, though they save it in Redis with an expiration time, while we cache them in memory on each API server.

I hope the code comments make it clear enough how the implementation works.

use std::sync::LazyLock;

const PRIVATE_KEY_PEM: &[u8] = br#"
-----BEGIN PRIVATE KEY-----
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm somewhat surprised GitHub allowed me to push this 😂

As the code comment above says: DO NOT USE THESE IN PRODUCTION!!!1

@@ -37,6 +38,7 @@ fn main() -> anyhow::Result<()> {
.databases_from_config(&config.db)
.github(github)
.github_oauth_from_config(&config)
.trustpub_providers(&list("TRUSTPUB_PROVIDERS")?)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this essentially acts as a feature flag for each of the providers. it allows us to enable e.g. github on the staging environment, but not yet on production. once we add new providers it will also allow us to gradually enable them.

if a provider is breached it would also allow us to temporarily disable them.

@@ -186,6 +190,9 @@ impl Server {
.unwrap_or_default()
);

let domain_name = dotenvy::var("DOMAIN_NAME").unwrap_or_else(|_| "crates.io".into());
let trustpub_audience = var("TRUSTPUB_AUDIENCE")?.unwrap_or_else(|| domain_name.clone());
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this allows us to configure the expected aud claim, if necessary. it defaults to the domain name (crates.io / staging.crates.io).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! This may or may not be applicable in your case, but noting: PyPI intentionally chose not to use the domain here, since we thought we might want to change the authentication and/or exchange subdomain at some point. So PyPI uses pypi and TestPyPI uses testpypi.

(This probably doesn't matter in practice, just flagging as a very small difference! In practice, any identifier that serves as a domain separator is fine, DNS name or whatever else 🙂)

Comment on lines +21 to +32
/// Exchange an OIDC token for a temporary access token.
#[utoipa::path(
put,
path = "/api/v1/trusted_publishing/tokens",
request_body = inline(json::ExchangeRequest),
tag = "trusted_publishing",
responses((status = 200, description = "Successful Response", body = inline(json::ExchangeResponse))),
)]
pub async fn exchange_trustpub_token(
state: AppState,
json: json::ExchangeRequest,
) -> AppResult<Json<json::ExchangeResponse>> {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in case you didn't find it in the large diff: this is the main entrypoint of the new API endpoint 😉

Comment on lines +395 to +404
/// Add a new OIDC keystore to the application
pub fn with_oidc_keystore(
mut self,
issuer_url: impl Into<String>,
keystore: MockOidcKeyStore,
) -> Self {
self.oidc_key_stores
.insert(issuer_url.into(), Box::new(keystore));
self
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

by default, all tests start with no Trusted Publishing providers enabled, but this fn can be used to explicitly add them (with MockOidcKeyStore implementations).


/// Generate a new access token.
pub fn generate() -> Self {
Self::from_u64s(rand::random(), rand::random())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small thing, but two questions:

  1. Why two u64s instead of a single [u8; 16]? The per-call overhead for the RNG is minuscule, but a single call would eliminate it and would prevent any implied separation between the halves of the uniformly random token.
  2. Does crates.io have a forking/multi-process model? If so rand::random() might be risky -- I believe it uses ThreadRng by default, which isn't guaranteed to be self-healing over process forks (i.e., it needs to be explicitly reseeded to prevent duplicated randomness between forks). If there are multiple processes at play that can call this generate() function, then my recommendation would be to use OsRng instead.

(The historical reason for not using OsRng is performance concerns around the system calls it can incur, but on Linux getrandom has been in the vDSO since 6.11.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does crates.io have a forking/multi-process model?

unless I'm misunderstanding our application, crates.io only uses a single process, but then has multiple worker threads inside the process. since we don't have a GIL in Rust, or are generally single-threaded like in JS, there is usually no need for multiple processes.

Why two u64s instead of a single [u8; 16]?

no real reason actually. it seemed easiest to get to the hex output from there, but I'm totally open to alternatives 😅

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unless I'm misunderstanding our application, crates.io only uses a single process, but then has multiple worker threads inside the process. since we don't have a GIL in Rust, or are generally single-threaded like in JS, there is usually no need for multiple processes.

Gotcha, thanks! The concern with rand::random is only when there are multiple processes, not multiple threads, so that API should be fine 🙂

no real reason actually. it seemed easiest to get to the hex output from there, but I'm totally open to alternatives 😅

It's an extreme nitpick from me so feel free to ignore! But from a type/domain coherence perspective I'd recommend [u8; 16] so that it's obvious that there's no difference between the "halves" of the uniformly random token 🙂

let new_token = AccessToken::generate();

let new_token_model = NewToken {
expires_at: chrono::Utc::now() + chrono::Duration::minutes(30),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will 30 minutes be long enough for cargo builds to complete? I know there was some discussion around this.
rust-lang/rfcs#3691 (comment)

Would it make sense to make this configurable via an environment variable to make it easy for us to adjust?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to make this configurable via an environment variable to make it easy for us to adjust?

I could make it configurable, though I'm wondering whether it would be better to accept it as a request parameter. I guess 30min is probably fine for this MVP though? :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a datapoint: we currently do 15 minutes for PyPI, but all of you are eminently more qualified to know what makes more sense here!

https://github.com/pypi/warehouse/blob/c92eaf2777cb26586f2817136e80037c6697e55c/warehouse/oidc/views.py#L288

(Whatever expiry you'd pick, I'd recommend not making it client configurable -- IMO TP's benefit is easiest to justify when expiries are consistent, i.e. users can't configure their way into a longer-lived credential by accident.)


/// Exchange an OIDC token for a temporary access token.
#[utoipa::path(
put,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be compliant with RFC 8693, this should be done via a POST.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems like that RFC requires quite a few other things like using application/x-www-form-urlencoded (which PyPI also doesn't do). I wasn't aware that we were supposed to follow that RFC? 🤔

the rest of the API generally uses PUT to create resources and POST for editing existing resources, which is why I chose PUT here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I don't believe PyPI follows that RFC either (at least, I didn't follow it while implementing Trusted Publishing 😅).

IMO the state/parameters used here are pretty distinct from the standard OAuth2 Token Exchange anyways, so deviating here seems like not a huge deal.

Comment on lines +6 to +8
pub struct ExchangeRequest {
pub jwt: String,
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We probably want to expand and adjust this to be inline with RFC 8693.

The jwt here in the original implementation will be the subject_token in the spec.

Here's what I'd expect an example token exchange request to look like (a lot of implementations also support JSON content types here too):

POST /api/v1/trusted_publishing/tokens
Content-Type: application/x-www-form-urlencoded

grant_type=urn:ietf:params:oauth:grant-type:token-exchange&
subject_token=eyJhbGciOiJSUzI1NiIs...github-oidc-id-token...&
subject_token_type=urn:ietf:params:oauth:token-type:id_token&
audience=crates.io&
requested_token_type=urn:ietf:params:oauth:token-type:access_token

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-backend ⚙️ C-enhancement ✨ Category: Adding new behavior or a change to the way an existing feature works
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants