-
Notifications
You must be signed in to change notification settings - Fork 182
RUST-2235 Implement GSSAPI auth support for Linux and macOS #1413
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@abr-egn I am not able to add other reviewers, so if you know who else could/should be tagged on this, can you add them? Thank you! |
Assigned |
Ok I believe I've addressed the majority of failures now. The two remaining issues are:
Beyond addressing those two issues, the last thing I need to do is double check the spec for how to handle password being provided on Windows. It looks like there is supposed to be some special handling that happens only on Windows if the password is provided as a uri option. I'll look into that and make any relevant changes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM once everything is working on every OS/tests are passing. Good design. Just fix the format!s because clippy will start complaining whenever they update.
Cargo.toml
Outdated
@@ -58,6 +58,9 @@ gcp-oidc = ["dep:reqwest"] | |||
# This can only be used with the tokio-runtime feature flag. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unrelated to this PR, really (though we could fix it here), should we update this comment since async-std support was dropped?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hah, yes, good catch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure what the update should be here. Do I just delete this comment on line 58? Or change it to say something else (if so, what should it say)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, yup, just delete the comment. The driver only supports tokio now so the there's nothing that needs to be documented about this feature.
} | ||
|
||
if credential.source.as_deref().unwrap_or("$external") != "$external" { | ||
return Err(ErrorKind::InvalidArgument { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, just noticed that the "must be $external" message for OIDC is slightly different than everything else, maybe we should update it 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated 👍
src/client/auth/gssapi.rs
Outdated
hostname: &str, | ||
) -> Result<(Self, Vec<u8>)> { | ||
let service_name: &str = properties.service_name.as_ref(); | ||
let mut service_principal = format!("{}/{}", service_name, hostname); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let mut service_principal = format!("{}/{}", service_name, hostname); | |
let mut service_principal = format!("{service_name}/{hostname}"); |
Let's just fix this before clippy starts whining about it the next time they update it :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated all 👍
src/client/auth/gssapi.rs
Outdated
let service_name: &str = properties.service_name.as_ref(); | ||
let mut service_principal = format!("{}/{}", service_name, hostname); | ||
if let Some(service_realm) = properties.service_realm.as_ref() { | ||
service_principal = format!("{}@{}", service_principal, service_realm); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
service_principal = format!("{}@{}", service_principal, service_realm); | |
service_principal = format!("{service_principal}@{service_realm}"); |
ditto
src/client/auth/gssapi.rs
Outdated
// If no SERVICE_REALM was specified, use realm specified in the | ||
// username. Note that `realm` starts with '@'. | ||
let (_, realm) = user_principal.split_at(idx); | ||
service_principal = format!("{}{}", service_principal, realm); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
service_principal = format!("{}{}", service_principal, realm); | |
service_principal = format!("{service_principal}{realm}"); |
@mattChiaravalloti all the failures atm seem to be from missing libclang, it may make sense to turn off the gssapi feature flag for those tests |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks very reasonable!
The MSRV issue can be solved by just bumping up MSRV to 1.82 which introduced the unsafe extern
block syntax; that's well past our six month rolling window.
I agree with Patrick that it makes sense to turn off the GSSAPI feature flag for the Graviton variants; I don't think there's enough value in the marginal test coverage increase to justify the engineer time to figure out if it can be made to work.
Cargo.toml
Outdated
@@ -58,6 +58,9 @@ gcp-oidc = ["dep:reqwest"] | |||
# This can only be used with the tokio-runtime feature flag. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hah, yes, good catch.
Cargo.toml
Outdated
derive_more = "0.99.17" | ||
derive-where = "1.2.7" | ||
dns-lookup = { version = "2.0", optional = true } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not use hickory-resolver
which we already have as an optional dep?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't even realize we had that imported at first to be honest. Just updated to use it. The interface is a bit clunkier than dns-lookup
but at the end of the day I think it's preferable to use the same crate. Thanks for pointing this out.
… the other auth mechs
…nd introduce GSSAPI-specific variants, tasks, and functions to evergreen config
Thanks! I believe I updated 1.81 -> 1.82 in all appropriate places. Time to see if it works as expected 🤞
Amazing, love when this is a solution. Given this, and the upcoming end-to-end GSSAPI auth testing coming up, I refactored |
@abr-egn I discovered that cross-krb5 is not quite as "cross-platform" as I initially believed: it does not work on Windows the way we need it to. It either requires that the Kerberos credentials be the same as the currently logged-in Windows user (not an assumption we could/should make), or that the Windows client install MIT Kerberos. That wouldn't be the worst thing to require but also isn't an ideal user experience. I started trying to write an alternative implementation for Windows using the originally planned I talked to Natacha and the rest of my team and we decided I'll split this up. This ticket/PR will just cover GSSAPI auth support for Linux and Mac. I'll also only cover end-to-end testing for those two in the next ticket I have in progress, RUST-2236. I'll file a follow-up ticket (in the epic) to implement and e2e test GSSAPI auth support on Windows. I'll pick that one up a bit later, though, since I've been struggling with the kerberos API for about 3 weeks now and need to step away from it a bit to catch my breath. Just an FYI for why Windows support wasn't done as part of this PR! |
src/client/auth/gssapi.rs
Outdated
return Ok(hostname.to_string()); | ||
} | ||
|
||
let resolver = TokioAsyncResolver::tokio_from_system_conf().map_err(|e| { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should use ClientOptions::resolver_config
, so that'll likely need to be added to HandshakerOptions
and plumbed through in EstablisherOptions::from_client_options
and Credential::authenticate_stream
.
If you're feeling ambitious it's feeling like it's about time for an AuthenticateOptions
struct that bundles together the non-Connection
arguments to authenticate_stream
and has a From<ClientOptions>
impl so it can just get passed through Handshaker
without having to update all the call sites each time new data like this is needed, but absolutely reasonable if you'd rather not include that in this PR 🙂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So far, I only piped through resolver_config
so that it makes it all the way down to canonicalize_hostname
. Let me know if you think it all looks good since it involved touching a big number of files. I decided to go with [cfg(any(feature = "dns-lookup", feature = "gssapi-auth"))]
in the places where it's possible that only gssapi-auth
is enabled without explicitly enabling dns-lookup
(and vice versa). Curious to see what you think about that decision in particular.
Also, I was considering how to best introduce the AuthenticateOptions
idea you proposed, but I'm a bit confused what you're envisioning. You suggested having it implement From<ClientOptions>
, but where would those ClientOptions
come from? By the time we call authenticate_stream
we're already 2 steps removed from ClientOptions
: it goes ClientOptions
--into--> HandshakerOptions
--into--> Handshaker
, and then Handshaker
calls authenticate_stream
. Are you suggesting that Handshaker
be updated to contain an AuthenticateOptions
struct field instead of all the non-connection individual fields it currently has? Would that same thing apply to HandshakerOptions
? Maybe it would be best to leave this to someone on your team in a follow-up ticket since you guys have better expertise on the design practices you use in this repo!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So far, I only piped through
resolver_config
so that it makes it all the way down tocanonicalize_hostname
. Let me know if you think it all looks good since it involved touching a big number of files. I decided to go with[cfg(any(feature = "dns-lookup", feature = "gssapi-auth"))]
in the places where it's possible that onlygssapi-auth
is enabled without explicitly enablingdns-lookup
(and vice versa). Curious to see what you think about that decision in particular.
I think the right thing here is for the 'gssapi-auth' feature to require 'dns-lookup', not 'dep:hickory-resolver' (after all, it doesn't really care how the dns resolution happens). That would mean all the existing places that are conditioned on dns-lookup
can continue to be conditioned on just that and don't need to change, and anything added for GSSAPI support specifically can be conditioned on just #[cfg(feature = "gssapi-auth")]
.
Also, I was considering how to best introduce the
AuthenticateOptions
idea you proposed, but I'm a bit confused what you're envisioning. You suggested having it implementFrom<ClientOptions>
, but where would thoseClientOptions
come from? By the time we callauthenticate_stream
we're already 2 steps removed fromClientOptions
: it goesClientOptions
--into-->HandshakerOptions
--into-->Handshaker
, and thenHandshaker
callsauthenticate_stream
. Are you suggesting thatHandshaker
be updated to contain anAuthenticateOptions
struct field instead of all the non-connection individual fields it currently has? Would that same thing apply toHandshakerOptions
? Maybe it would be best to leave this to someone on your team in a follow-up ticket since you guys have better expertise on the design practices you use in this repo!
Agreed, definitely better for one of us in a follow-up ticket, there's a lot of context here 🙂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Smart, yes that's a better solution. I'll work on that change and get it back in shortly. Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@abr-egn It is all updated appropriately now 👍 Do you want me to file the follow-up ticket for AuthenticationOptions
, or do you want to file it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Filed RUST-2247; it'll make a nice capper for the end of spec fest 😁
MacOS hosts are pretty busy now so those might take a while. They've been sporadically failing the last few commits. Hopefully those are just transient errors that don't need to be addressed as part of this PR. What's more concerning was |
There's an uncommon but persistent flaky failure in that test (example) that we haven't been able to nail down, I suspect that's what you ran into. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
This PR implements the GSSAPI auth mechanism, as described by the spec. This took a bit longer than anticipated because manual end-to-end testing revealed that the initial implementation was not correct. Subsequent fixes made slow progress towards correctness. Ultimately, the implementation you see here has been manually tested on macOS (locally, against a local KDC and mongod) and an ubuntu2204-arm64-small evergreen host (against the LDAPTEST KDC/mongod). I still need to test this on Windows. I have a separate follow-up ticket to implement automated end-to-end testing, RUST-2236, which I will put up a PR for after confirming this works on Windows and actually writing end-to-end tests.
This implementation is guarded by a feature flag, and passes the existing spec tests when the feature flag is enabled.