Skip to content

Conversation

@gdiepen
Copy link
Contributor

@gdiepen gdiepen commented Mar 4, 2025

In the original setup you have a configuration option OIDC_GROUP_NAME which provides a comma separated list of groups that are considered relevant.

Every new group that is added to AD that would allow access to MLFlow would require a restart of the mlflow server with an updated configuration.

Instead, if we use patterns like mlflow-users-* as an allowed pattern, it would indicate that any group starting withmlflow-users- would be a group that gives access to mlflow.

One additional benefit is that if the okta function returns a large number of groups (e.g. 300 different groups that a user is a member of), only the groups considered relevant based on the patterns are added to the permission databases, the rest is ignored.

This PR modifies the code to not work with a hardcoded list of full names, but with a list of patterns instead.

Decided to not use full regular expressions here, but simple file-system like matching using the fnmatch library in python.

Implements: #78

@gdiepen
Copy link
Contributor Author

gdiepen commented Mar 4, 2025

I am currently working on the unit tests. Having some problems with the fact that the AppConfig object is evaluated only once, trying to work around this.

Also needed to refactor the token functions into a separate util module because otherwise I ended up with circular import errors during the unit tests....

@gdiepen
Copy link
Contributor Author

gdiepen commented Mar 7, 2025

@kharkevich This feature is fully implemented now, including the test functions

@gdiepen gdiepen force-pushed the feature/use_patterns_to_filter_allowed_groups branch from f023bff to b3879db Compare April 29, 2025 09:56
@grudloffev
Copy link

Hi, I believe this PR is really useful, however I don't think it's a good idea to break backwards compatibility by changing the names

@gdiepen
Copy link
Contributor Author

gdiepen commented Apr 29, 2025

I explicitly did not want to re-use the existing name because the logic is very different (in essence it is a filter instead of just a list).

Let me know what you think is the best solution and happy to modify the code.

@grudloffev
Copy link

hmm so it only allows filters? Wouldn't it make sense to support both in the same list and handle it accordingly?

FYI I am not a maintainer, so just voicing my personal opinion here.

@gdiepen
Copy link
Contributor Author

gdiepen commented Apr 29, 2025

Technically, using a full name (like in the original environment variable OIDC_GROUP_NAME) is also a filter and would do exactly the same as what it currently does. So I could re-use the same OIDC_GROUP_NAME variable also

Maybe one of the developers can share their thoughts on what they would like?

self.OIDC_USERS_DB_URI = os.environ.get("OIDC_USERS_DB_URI", "sqlite:///auth.db")
self.OIDC_GROUP_NAME = [group.strip() for group in os.environ.get("OIDC_GROUP_NAME", "mlflow").split(",")]
self.OIDC_GROUP_FILTER_PATTERNS = [
pattern.strip() for pattern in os.environ.get("OIDC_GROUP_FILTER_PATTERNS", "*").split(",")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using str.split(",") here may break expressions that include commas. Would parsing a JSON array be more robust?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants