-
Notifications
You must be signed in to change notification settings - Fork 7
Feature/OIDC device flow #122
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature/OIDC device flow #122
Conversation
|
@jhiemstrawisc The code needed for a Jupyter cell to work: import getpass
import sys
from io import StringIO
# Capture the password securely
password = getpass.getpass("Enter password to encrypt credentials file: ")
# Redirect stdin to provide the password
old_stdin = sys.stdin
sys.stdin = StringIO(password + "\n")
try:
list = pelfs.ls("/ospool/ap40/data/emma.turetsky")
print(list)
finally:
sys.stdin = old_stdin |
jhiemstrawisc
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In addition to source code review, I tried testing this by hand as a user might:
>>> pelfs = PelicanFileSystem("pelican://osg-htc.org")
>>> pelfs.ls('/ospool/ap40/justin.hiemstra/')
The client is able to save the authorization in a local file.
This prevents the need to reinitialize the authorization for each transfer.
You will be asked for this password whenever a new session is started.
Please provide a new password to encrypt the local OSDF client configuration file:
asdf
To approve credentials for this operation, please navigate to the following URL and approve the request:
https://osdf-ospool-issuer.osgdev.chtc.io/scitokens-server/device?user_code=GWT-LMC-2HM
WARNING[2025-11-17T15:03:14-06:00] Token was acquired from issuer but it does not appear valid for transfer; trying anyway
eyJraWQiOiI1ZTQ5IiwidHlwIjoiSldUIiwiYWxnIjoiUlMyNTYifQ.eyJ3bGNnLnZlciI6IjEuMCIsImF1ZCI6Imh0dHBzOi8vd2xjZy5jZXJuLmNoL2p3dC92MS9hbnkiLCJzdWIiOiJqdXN0aW4uaGllbXN0cmEiLCJuYmYiOjE3NjM0MTMzODksInNjb3BlIjoic3RvcmFnZS5yZWFkOi9kYXRhL2p1c3Rpbi5oaWVtc3RyYSIsImlzcyI6Imh0dHBzOi8vb3NnLWh0Yy5vcmcvb3Nwb29sIiwiZXhwIjoxNzYzNDE0NDAzLCJpYXQiOjE3NjM0MTMzOTQsImp0aSI6Imh0dHBzOi8vb3NkZi1vc3Bvb2wtaXNzdWVyLm9zZ2Rldi5jaHRjLmlvL3NjaXRva2Vucy1zZXJ2ZXIvNmZmMTE0ZWRkNWE0MzllMTNkYTc4YzNmNzJkZThlOGY_dHlwZT1hY2Nlc3NUb2tlbiZ0cz0xNzYzNDEzMzk0MjM4JnZlcnNpb249djIuMCZsaWZldGltZT0xMDA5MDAwIn0.<Justin redacted this bit>
Using fallback token even though it may not be fully acceptable
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/jhiemstra/.local/share/mamba/envs/test-pelicanfs/lib/python3.12/site-packages/fsspec/asyn.py", line 118, in wrapper
return sync(self.loop, func, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/jhiemstra/.local/share/mamba/envs/test-pelicanfs/lib/python3.12/site-packages/fsspec/asyn.py", line 103, in sync
raise return_result
File "/home/jhiemstra/.local/share/mamba/envs/test-pelicanfs/lib/python3.12/site-packages/fsspec/asyn.py", line 56, in _runner
result[0] = await coro
^^^^^^^^^^
File "/home/jhiemstra/snakemake-executor-dev/pelicanfs/src/pelicanfs/core.py", line 711, in wrapper
return await func(self, data_url, *args[1:], **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/jhiemstra/snakemake-executor-dev/pelicanfs/src/pelicanfs/core.py", line 723, in _ls
out = await self._ls_real(path, detail=detail)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/jhiemstra/snakemake-executor-dev/pelicanfs/src/pelicanfs/core.py", line 751, in _ls_real
return await self._ls_real(url, detail=detail, client=client_ctx)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/jhiemstra/snakemake-executor-dev/pelicanfs/src/pelicanfs/core.py", line 760, in _ls_real
items = await list_files(remote_dir)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/jhiemstra/.local/share/mamba/envs/test-pelicanfs/lib/python3.12/site-packages/aiowebdav2/client.py", line 344, in list_with_infos
response = await self.execute_request(
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/jhiemstra/.local/share/mamba/envs/test-pelicanfs/lib/python3.12/site-packages/aiowebdav2/client.py", line 264, in execute_request
raise AccessDeniedError(self._url)
aiowebdav2.exceptions.AccessDeniedError: Access denied to https://ap40.uw.osg-htc.org:8443
I'm not too worried about the 403 I'm getting (I think this is external to PelicanFS), but I am worried that:
- The password I enter for the local token cache shows up in plain text
- The token itself is written out
Can you find a way to prevent these two things from happening?
- Add OIDC_DEVICE_FLOW to TokenDiscoveryMethod enum - Add operation and destination_url fields to TokenContentIterator dataclass - Update docstring to document OIDC device flow as final fallback method - Add Optional type import for new optional fields This prepares the token discovery system to support OIDC/OAuth device flow authentication via the pelican binary as a final fallback when all other token sources have been exhausted.
- Implement _pelican_binary_exists() to check for pelican binary in PATH - Implement _get_pelican_flag() to map TokenOperation to command flags - Implement _get_token_from_pelican_binary() to invoke pelican binary
- Add OIDC_DEVICE_FLOW case to __next__() method as final fallback - Check for pelican binary existence before attempting device flow - Display helpful warning if binary not found - Invoke device flow and return token if successful - Raise StopIteration if binary missing or token acquisition fails
- Pass operation and destination_url to TokenContentIterator initialization - Enables OIDC device flow to request appropriate permissions (read/write) - Provides destination URL required for pelican binary invocation
- Add subprocess import for testing process invocation - Add 12 new test functions covering all OIDC device flow scenarios: * Binary existence checking * Successful token acquisition with various output formats * Warning prefix handling * Read/write operation flag mapping * Error handling (binary failures, missing tokens, timeouts) * Edge cases (no destination URL, missing binary).
The pelican binary requires pelican:// protocol URLs instead of HTTP URLs. Changes: - Add pelican_url parameter to TokenGenerator and TokenContentIterator - Make _handle_token_generation() async to call await self._set_director_url() - Extract federation host from director URL and construct pelican://<federation-host>/<path> - Update _get_token_from_pelican_binary() to use pelican_url instead of destination_url - Update all call sites to use await (in async methods) or sync() (in sync methods)
…flow The pelican binary needs a terminal to run interactively and handle OIDC device flow prompts (including optional password prompts for config encryption). Changes: - Create a pseudo-terminal for subprocess - Implement bidirectional I/O with select.select() to forward stdin to subprocess - Echo subprocess output to terminal while capturing for token extraction - Extract read_and_echo_output() helper function to reduce duplication - Update all OIDC device flow tests to mock pseudo-terminal operations
When running in Jupyter or IPython environments, sys.stdout may be replaced with a custom object that doesn't have a .buffer attribute. This caused an AttributeError when trying to echo PTY output. Fix: Add try/except to fall back to text mode (decode and write to stdout) when buffer attribute is not available.
When sys.stdin is redirected in Jupyter/IPython (e.g., using StringIO for password input), it doesn't support fileno() which caused select.select() to fail with io.UnsupportedOperation. Fix: Check if stdin supports fileno() before including it in select monitoring. If not available, only monitor the PTY master_fd. The subprocess can still read password input through the PTY even without bidirectional forwarding.
When stdin is redirected (e.g., StringIO in Jupyter), read all available data from it and write it to the PTY master_fd. This allows password input to work when stdin is programmatically redirected. The data is sent during each iteration of the select loop until consumed, then cleared to avoid repeated sends.
7c704db to
528a590
Compare
-- Fixed security issue (passwords and tokens redacted in terminal output) -- Fixed token generation to use discovery_url instead of director_url -- Made timeouts and buffer sizes configurable -- Added tentative windows compatibilitys with oidc devicde flow -- Added a helper method to help with potential test fragility -- Now uses debug flag in pelican binary call when debug logging is enabled in pelicanfs -- Added pelican binary installation link to warning message -- Changed final call to continued instead of explicitly raising stop iteration in token content iterations -- Added debug log showing PATH when checking for pelican binary
|
@jhiemstrawisc I've tested this using a Jupyter notebook and with the command line - I believe I've addressed your concerns. |
jhiemstrawisc
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is getting pretty close. The only issue I'm still having is with redacting stuff the input password to pelican token fetch:
$ python
Python 3.12.12 | packaged by conda-forge | (main, Oct 22 2025, 23:34:53) [Clang 19.1.7 ] on darwin
Type "help", "copyright", "credits" or "license" for more information.
Cmd click to launch VS Code Native REPL
>>> from pelicanfs import PelicanFileSystem
>>> pelfs = PelicanFileSystem("pelican://osg-htc.org")
>>> pelfs.ls('/ospool/ap40/data/justin.hiemstra/')
The OSDF client configuration is encrypted. Enter your password: [REDACTED]
asdf
[TOKEN_REDACTED]
Using fallback token even though it may not be fully acceptable
[{'name': '/ospool/ap40/data/justin.hiemstra/ap40-foo.txt', 'size': 9, 'type': 'file', 'modified': datetime.datetime(2025, 11, 10, 16, 47, 45)}]
You'll notice Enter your password: [REDACTED] -- this shows up before I've started typing, but then my super secret asdf password still shows up in plain text. I think this is critical enough to get right on the first pass of the code.
jhiemstrawisc
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Lots of little things in here to test, but things "work on my machine"
Implements a call to the pelican binary to run an OIDC flow to get a token for the given namespace.