Skip to content

Conversation

@drewaogle
Copy link
Contributor

CSVLoader actually wants an RangeIndex, since everything wants to use start and only RangeIndex has it. It's better to catch it earlier.

I missed passing token to Configuration when loading from the config file, so if you save a aperturedb key, it will lose the token when it loads.

@drewaogle drewaogle requested review from bovlb and gsaluja9 July 10, 2025 23:52
@drewaogle drewaogle self-assigned this Jul 10, 2025
gsaluja9
gsaluja9 previously approved these changes Jul 11, 2025
Copy link
Collaborator

@gsaluja9 gsaluja9 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

Copy link
Contributor

@bovlb bovlb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

# most users don't supply their own df, so this is mostly a sanity check
# for when an advanced user has done filtering and have a IntervalIndex.
if not isinstance(self.df.index, pd.RangeIndex):
raise Exception(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
raise Exception(
raise TypeError(

f"CSVParser requires a RangeIndex. the supplied DataFrame has a {type(self.df.index)} index.")
else:
if df is not None:
raise Exception("Dask requires a csv passed as a filename")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
raise Exception("Dask requires a csv passed as a filename")
raise ValueError("Dask requires a csv passed as a filename")

use_rest=config["use_rest"],
use_ssl=config["use_ssl"])
use_ssl=config["use_ssl"],
token=config["token"] if "token" in config else None)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
token=config["token"] if "token" in config else None)
token=config.get("token"),
)

@luisremis luisremis requested a review from Copilot July 12, 2025 01:32
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds a sanity check for DataFrame indexes in the CSV parser and ensures the token field is preserved when loading configurations from file.

  • Enforce that CSVParser only accepts pandas DataFrames with a RangeIndex.
  • Pass the token value into the Configuration constructor when reading from the CLI config file.
  • Improve error handling around Dask input for CSVParser.

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File Description
aperturedb/cli/configure.py Include token argument when constructing Configuration from config.
aperturedb/CSVParser.py Validate DataFrame index is a RangeIndex; raise exceptions for invalid CSV/Dask inputs.
Comments suppressed due to low confidence (2)

aperturedb/CSVParser.py:69

  • Add unit tests to verify that providing a non-RangeIndex DataFrame raises the expected exception in CSVParser.
            if not isinstance(self.df.index, pd.RangeIndex):

aperturedb/cli/configure.py:79

  • Add tests for loading and saving configurations to confirm the token is correctly carried through and defaults to None when absent.
                token=config["token"] if "token" in config else None)

use_rest=config["use_rest"],
use_ssl=config["use_ssl"])
use_ssl=config["use_ssl"],
token=config["token"] if "token" in config else None)
Copy link

Copilot AI Jul 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] Consider using config.get("token") to simplify this expression and default to None when the key is missing.

Suggested change
token=config["token"] if "token" in config else None)
token=config.get("token", None))

Copilot uses AI. Check for mistakes.
# for when an advanced user has done filtering and have a IntervalIndex.
if not isinstance(self.df.index, pd.RangeIndex):
raise Exception(
f"CSVParser requires a RangeIndex. the supplied DataFrame has a {type(self.df.index)} index.")
Copy link

Copilot AI Jul 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use a more specific exception type like ValueError and capitalize the sentence in the message (e.g., "The supplied DataFrame...").

Copilot uses AI. Check for mistakes.
f"CSVParser requires a RangeIndex. the supplied DataFrame has a {type(self.df.index)} index.")
else:
if df is not None:
raise Exception("Dask requires a csv passed as a filename")
Copy link

Copilot AI Jul 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replace the generic Exception with ValueError and clarify the message, for example: "Dask mode requires a CSV filename; DataFrame inputs are not supported."

Suggested change
raise Exception("Dask requires a csv passed as a filename")
raise ValueError("Dask mode requires a CSV filename; DataFrame inputs are not supported.")

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants