Conversation
|
@Tmonster Do you want to review this ? |
…T body credentials for Google compatibility. Add tests to validate behavior for refresh_token and client_credentials grants.
|
You can also give a try by using this commands Extract Your Credentials from gcloud ADC file: CLIENT_ID=$(cat ~/.config/gcloud/application_default_credentials.json | jq -r '.client_id')
CLIENT_SECRET=$(cat ~/.config/gcloud/application_default_credentials.json | jq -r '.client_secret')
REFRESH_TOKEN=$(cat ~/.config/gcloud/application_default_credentials.json | jq -r '.refresh_token')LOAD iceberg;
-- Create secret with refresh_token for automatic refresh
CREATE SECRET biglake_auto_refresh (
TYPE ICEBERG,
CLIENT_ID '<your-client-id>',
CLIENT_SECRET '<your-client-secret>',
REFRESH_TOKEN '<your-refresh-token>',
OAUTH2_SERVER_URI 'https://oauth2.googleapis.com/token',
EXTRA_HTTP_HEADERS MAP {'x-goog-user-project': '<your-project-id>'}
);
-- Attach to BigLake
ATTACH 'gs://biglake-public-nyc-taxi-iceberg' AS biglake_public (
TYPE ICEBERG,
SECRET biglake_auto_refresh,
ENDPOINT 'https://biglake.googleapis.com/iceberg/v1/restcatalog'
);
-- Query data - token will auto-refresh when needed!
SELECT COUNT(*) FROM biglake_public.public_data.nyc_taxicab; |
|
Just additional context. I had to change Original code (line 193): doc = std::unique_ptr<yyjson_doc, YyjsonDocDeleter>(ICUtils::api_result_to_doc(response->body));The old code was using an Iceberg catalog API parser for OAuth2 token responses that matches Iceberg catalog APIs (e.g. OAuth2 token endpoints return successful responses like: { "access_token": "...", "token_type": "Bearer", "expires_in": 3600 }And per RFC 6749 §5.2, OAuth2 errors look like: { "error": "invalid_scope", "error_description": "..." }Notice error is a string, not an object. So when api_result_to_doc does yyjson_obj_get(error, "message"), it returns NULL, and we end up throwing "No message available" which is super misleading and hides the real OAuth2 error. Switching to: auto *doc = yyjson_read(response->body.c_str(), response->body.size(), 0);means we’re just parsing JSON without imposing Iceberg catalog semantics on it. Then OAuthTokenResponse::FromJSON can validate the token response in the correct OAuth2-aware way and surface meaningful errors. |
Tmonster
left a comment
There was a problem hiding this comment.
Thanks!
I will try and take a closer look tomorrow, today my schedule is pretty busy.
Tmonster
left a comment
There was a problem hiding this comment.
Thanks! a couple of comments.
Also, do you know of any cloud based catalogs I could test this against? I'm guessing BigLake supports this?
src/storage/authorization/oauth2.cpp
Outdated
| post_data); | ||
| } catch (HTTPException &http_ex) { | ||
| // APIUtils::Request throws HTTPException for non-2xx responses. | ||
| throw InvalidConfigurationException("Could not get token from %s: HTTP error: %s", uri, http_ex.what()); |
There was a problem hiding this comment.
APIUtils actually doesn't throw for non-2XX responses. 400 responses can be parsed via the response object so you can inspect codes and error messages directly
| // Compute expiry time if we have both token and expires_in | ||
| if (!token.IsNull() && expires_in_seconds > 0) { | ||
| // Pass empty string for refresh_token to preserve the one we just set | ||
| result->UpdateTokenState(token.ToString(), expires_in_seconds, ""); |
There was a problem hiding this comment.
Accoring to src/storage/authorization/oauth2.cpp:630 we should have a lock when calling UpdateTokenState. Seems Seems like we don't have it here?
There was a problem hiding this comment.
No lock needed here. we're still constructing this object and haven't shared it with other threads yet. UpdateTokenState is safe to call during construction. Once this object is returned and shared, UpdateTokenState will only be called from RefreshAccessTokenUnlocked which requires holding token_mutex.
src/storage/authorization/oauth2.cpp
Outdated
|
|
||
| string OAuth2Authorization::GetToken(ClientContext &context, const string &grant_type, const string &uri, | ||
| const string &client_id, const string &client_secret, const string &scope) { | ||
| namespace { |
There was a problem hiding this comment.
nit: Can you merge with the other namespace above?
| ATTACH '' AS quick_expiry_lake ( | ||
| TYPE ICEBERG, | ||
| CLIENT_ID 'client1', | ||
| CLIENT_SECRET 's3cr3t', |
There was a problem hiding this comment.
Should we also pass an expires_in here?
There was a problem hiding this comment.
expires_in comes from the OAuth2 server response. No need to add expires_in her. it's automatically retrieved from the OAuth2 server response when using client credentials!
|
Hi @Tmonster Thank you for reviewing. Yes Google Could implement this RFC standard. You can use Biglake to test it. I gave instruction here. If you dont have credentail file
It should be under here: Linux, macOS: |
|
Ah yes, thanks. Will do that then. Also working on fixing the Polaris tests, will post here when they are fixed |
…g a lock for token state checks and refresh operations. Addresed comments on PR
OAuth2's refresh token flow (https://datatracker.ietf.org/doc/html/rfc6749#section-6) is not supported yet in iceberg extension.
This PR adds RFC 6749 refresh token grant support to DuckDB’s Iceberg extension OAuth2 flow. The Iceberg extension previously supported acquiring access tokens but did not implement the refresh path, causing long-running sessions to fail once the access token expired.
Refresh token rotation is tracked in-memory only. Rotated refresh tokens are not persisted back into DuckDB secrets; after DETACH/re-ATTACH, the original refresh token is used.