-
Couldn't load subscription status.
- Fork 4
Small fixes to CSV df and Configuration token #599
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
| # most users don't supply their own df, so this is mostly a sanity check | ||
| # for when an advanced user has done filtering and have a IntervalIndex. | ||
| if not isinstance(self.df.index, pd.RangeIndex): | ||
| raise Exception( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| raise Exception( | |
| raise TypeError( |
| f"CSVParser requires a RangeIndex. the supplied DataFrame has a {type(self.df.index)} index.") | ||
| else: | ||
| if df is not None: | ||
| raise Exception("Dask requires a csv passed as a filename") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| raise Exception("Dask requires a csv passed as a filename") | |
| raise ValueError("Dask requires a csv passed as a filename") |
| use_rest=config["use_rest"], | ||
| use_ssl=config["use_ssl"]) | ||
| use_ssl=config["use_ssl"], | ||
| token=config["token"] if "token" in config else None) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| token=config["token"] if "token" in config else None) | |
| token=config.get("token"), | |
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR adds a sanity check for DataFrame indexes in the CSV parser and ensures the token field is preserved when loading configurations from file.
- Enforce that CSVParser only accepts pandas DataFrames with a RangeIndex.
- Pass the
tokenvalue into the Configuration constructor when reading from the CLI config file. - Improve error handling around Dask input for CSVParser.
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| aperturedb/cli/configure.py | Include token argument when constructing Configuration from config. |
| aperturedb/CSVParser.py | Validate DataFrame index is a RangeIndex; raise exceptions for invalid CSV/Dask inputs. |
Comments suppressed due to low confidence (2)
aperturedb/CSVParser.py:69
- Add unit tests to verify that providing a non-RangeIndex DataFrame raises the expected exception in CSVParser.
if not isinstance(self.df.index, pd.RangeIndex):
aperturedb/cli/configure.py:79
- Add tests for loading and saving configurations to confirm the
tokenis correctly carried through and defaults toNonewhen absent.
token=config["token"] if "token" in config else None)
| use_rest=config["use_rest"], | ||
| use_ssl=config["use_ssl"]) | ||
| use_ssl=config["use_ssl"], | ||
| token=config["token"] if "token" in config else None) |
Copilot
AI
Jul 12, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] Consider using config.get("token") to simplify this expression and default to None when the key is missing.
| token=config["token"] if "token" in config else None) | |
| token=config.get("token", None)) |
| # for when an advanced user has done filtering and have a IntervalIndex. | ||
| if not isinstance(self.df.index, pd.RangeIndex): | ||
| raise Exception( | ||
| f"CSVParser requires a RangeIndex. the supplied DataFrame has a {type(self.df.index)} index.") |
Copilot
AI
Jul 12, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use a more specific exception type like ValueError and capitalize the sentence in the message (e.g., "The supplied DataFrame...").
| f"CSVParser requires a RangeIndex. the supplied DataFrame has a {type(self.df.index)} index.") | ||
| else: | ||
| if df is not None: | ||
| raise Exception("Dask requires a csv passed as a filename") |
Copilot
AI
Jul 12, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replace the generic Exception with ValueError and clarify the message, for example: "Dask mode requires a CSV filename; DataFrame inputs are not supported."
| raise Exception("Dask requires a csv passed as a filename") | |
| raise ValueError("Dask mode requires a CSV filename; DataFrame inputs are not supported.") |
CSVLoader actually wants an RangeIndex, since everything wants to use
startand only RangeIndex has it. It's better to catch it earlier.I missed passing token to Configuration when loading from the config file, so if you save a aperturedb key, it will lose the token when it loads.