Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

internal tagging #377

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open

internal tagging #377

wants to merge 1 commit into from

Conversation

rguo123
Copy link
Collaborator

@rguo123 rguo123 commented Mar 5, 2025

Important

Adds tag completion waiting mechanism to AtlasMapTags and updates version to 3.4.2.

  • Behavior:
    • Adds wait_time parameter to df and get_tags in AtlasMapTags to specify max wait time for tag completion.
    • Implements is_tag_complete in AtlasMapTags to check tag completion status.
    • Uses /v1/project/projection/tags/robotag endpoint to ensure tag processing.
    • Logs a warning if a tag is unavailable after waiting.
  • Misc:
    • Updates version in setup.py from 3.4.1 to 3.4.2.

This description was created by Ellipsis for 075d630. It will automatically update as commits are pushed.

Copy link
Collaborator Author

rguo123 commented Mar 5, 2025

This stack of pull requests is managed by Graphite. Learn more about stacking.

@rguo123 rguo123 marked this pull request as ready for review March 10, 2025 19:43
@rguo123 rguo123 force-pushed the 03-05-internal_tagging branch from c10ace8 to c122f10 Compare March 10, 2025 19:44
@rguo123 rguo123 requested a review from mcembalest March 10, 2025 19:46
"project_id": self.dataset.id,
"tag_id": tag_id,
},
).json()["is_complete"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider adding error handling for the GET request in is_tag_complete. Currently, it directly calls .json() without checking status codes.

wait_start = time.time()
# Wait up to 5 minutes for tag to be completed
while not is_complete:
# Sleep 5 seconds
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment mentions sleeping 5 seconds, but the code uses time.sleep(15). Please adjust the comment or the sleep duration for consistency.

Suggested change
# Sleep 5 seconds
# Sleep 15 seconds

@@ -679,11 +680,14 @@ def __init__(self, projection: "AtlasProjection", auto_cleanup: Optional[bool] =
self.auto_cleanup = auto_cleanup

@property
def df(self, overwrite: bool = False) -> pd.DataFrame:
def df(self, overwrite: bool = False, wait_time: int = 120) -> pd.DataFrame:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Avoid using a @property with parameters. Consider converting df to a regular method so that parameters like overwrite and wait_time can be explicitly passed.

if is_complete:
keep_tags.append(tag)
else:
# Use robotag route instead of v1/n so we guarantee only one request gets launched
requests.post(self.dataset.atlas_api_path + "/v1/project/projection/tags/robotag", headers=self.dataset.header,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The POST request to the /v1/project/projection/tags/robotag endpoint lacks error handling. Consider checking the response status and handling failures appropriately.

@rguo123 rguo123 force-pushed the 03-05-internal_tagging branch from c122f10 to 075d630 Compare March 10, 2025 19:49
@@ -708,11 +712,25 @@ def df(self, overwrite: bool = False) -> pd.DataFrame:
tbs.append(tb)
return pa.concat_tables(tbs).to_pandas()

def get_tags(self) -> List[Dict[str, str]]:
def is_tag_complete(self, tag_id) -> bool:
is_complete = requests.get(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider adding error handling (e.g. response.raise_for_status) in is_tag_complete to gracefully handle HTTP errors.

if is_complete:
keep_tags.append(tag)
else:
# Use robotag route instead of v1/n so we guarantee only one request gets launched
requests.post(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider adding error handling for the robotag POST request to ensure it succeeds before proceeding.

)
wait_start = time.time()
# Wait up to 5 minutes for tag to be completed
while not is_complete:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is an inconsistency in the get_tags method in the AtlasMapTags class. The inline comment mentions sleeping for 5 seconds and waiting up to 5 minutes for a tag to be completed, but the code actually calls time.sleep(15) and uses a default wait_time of 120 seconds. Please update either the code or the comments so that they both reflect the intended behavior.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant