-
Notifications
You must be signed in to change notification settings - Fork 228
[ENH] Replace os.path with Path for dataset directory handling (#3088) #3104
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Thank you for contributing to
|
CI Failures SummaryHi! @MatthewMiddlehurst @TonyBagnall The CI reports several failing jobs, but the errors seem unrelated to the changes in this PR. Details
ContextThis PR only updates dataset directory handling to use RequestCould a maintainer please confirm whether these CI test failures are expected on the current Python versions? Thank you! 🙏 |
|
The CI failures is common to all the PRs. It would be fixed. |
|
Was there an issue with this PR? |
Hi, thanks for following up @MatthewMiddlehurst I closed the previous PR because, after reviewing it again, I realised that the change I made (replacing The real problem is that downloaded TSC datasets are currently saved inside the package directory ( For the proper fix, I’m planning to move the default dataset storage location to a user-writable directory using Before I start the new PR, could you please confirm that I’ll proceed as soon as I have your confirmation. |
|
It would be better to edit your current PR rather than create a new one for changes like this. Rather than AppData it should probably be wherever the code is run from right? i.e. if you run If you don't mind answering, why did you initially only change the method to find the path? Just curious whether it was an problem with the issue. |
I’m happy to continue with the existing PR instead of opening a new one. I closed it only because I thought the fix was incomplete and possibly misleading. About the default location: I’m happy to implement this and I’ll update the PR accordingly. |
Initially, I interpreted the issue as primarily a problem with how the dataset directory was being resolved, not where it should live. The use of So my first instinct was to fix the path resolution method to reduce those inconsistencies. Only after digging deeper into the datasets module did I realize the real problem is not about resolution, but about location. So my first PR only improved the mechanics of path construction, not the logic of choosing the correct base directory. Once I connected that the issue is actually about selecting a new default storage root, I realized the initial fix wasn’t addressing the underlying problem so I stepped back to rethink the design. |
|
Interesting to hear that other packages put it in AppData. May be worth asking on Slack if there is a preference? I think for now if you want to edit the PR just do it locally as its an improvement over the current method, and we can change it later if necessary. |
[ENH] Replace os.path with Path for dataset directory handling (#3088)
What does this implement/fix?
This PR modernizes the dataset path-handling logic across the
datasetsmodule by replacing legacyos.pathusage withpathlib.Path.Updated files:
aeon/datasets/_data_loaders.pyaeon/datasets/_single_problem_loaders.pyaeon/datasets/dataset_collections.pySummary of changes
from pathlib import Pathto each updated file.os.path.dirname(aeon.__file__)os.path.join(...)Path(aeon.__file__).resolve().parent / "datasets"How this solves issue #3088
The previous implementation relied on
os.path.dirname(aeon.__file__), which may resolve to unexpected paths under editable installs, virtual environments, or certain packaging scenarios.By using
Path(aeon.__file__).resolve().parent, the resolved path always points to the actual installed module directory.This stabilizes dataset directory discovery and prevents issues where datasets are loaded from or written to incorrect locations.
Testing
pytest).Does your contribution introduce a new dependency?
No,
pathlibis part of the Python standard library.Any other comments?
This improves cross-platform reliability and aligns the codebase with modern Python best practices.
PR checklist
@all-contributorsafter merge.[ENH].