-
Notifications
You must be signed in to change notification settings - Fork 184
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
internal tagging #377
base: main
Are you sure you want to change the base?
internal tagging #377
Conversation
c10ace8
to
c122f10
Compare
"project_id": self.dataset.id, | ||
"tag_id": tag_id, | ||
}, | ||
).json()["is_complete"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider adding error handling for the GET request in is_tag_complete
. Currently, it directly calls .json()
without checking status codes.
wait_start = time.time() | ||
# Wait up to 5 minutes for tag to be completed | ||
while not is_complete: | ||
# Sleep 5 seconds |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment mentions sleeping 5 seconds, but the code uses time.sleep(15)
. Please adjust the comment or the sleep duration for consistency.
# Sleep 5 seconds | |
# Sleep 15 seconds |
@@ -679,11 +680,14 @@ def __init__(self, projection: "AtlasProjection", auto_cleanup: Optional[bool] = | |||
self.auto_cleanup = auto_cleanup | |||
|
|||
@property | |||
def df(self, overwrite: bool = False) -> pd.DataFrame: | |||
def df(self, overwrite: bool = False, wait_time: int = 120) -> pd.DataFrame: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Avoid using a @property
with parameters. Consider converting df
to a regular method so that parameters like overwrite
and wait_time
can be explicitly passed.
nomic/data_operations.py
Outdated
if is_complete: | ||
keep_tags.append(tag) | ||
else: | ||
# Use robotag route instead of v1/n so we guarantee only one request gets launched | ||
requests.post(self.dataset.atlas_api_path + "/v1/project/projection/tags/robotag", headers=self.dataset.header, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The POST request to the /v1/project/projection/tags/robotag
endpoint lacks error handling. Consider checking the response status and handling failures appropriately.
c122f10
to
075d630
Compare
@@ -708,11 +712,25 @@ def df(self, overwrite: bool = False) -> pd.DataFrame: | |||
tbs.append(tb) | |||
return pa.concat_tables(tbs).to_pandas() | |||
|
|||
def get_tags(self) -> List[Dict[str, str]]: | |||
def is_tag_complete(self, tag_id) -> bool: | |||
is_complete = requests.get( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider adding error handling (e.g. response.raise_for_status
) in is_tag_complete
to gracefully handle HTTP errors.
if is_complete: | ||
keep_tags.append(tag) | ||
else: | ||
# Use robotag route instead of v1/n so we guarantee only one request gets launched | ||
requests.post( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider adding error handling for the robotag POST request to ensure it succeeds before proceeding.
) | ||
wait_start = time.time() | ||
# Wait up to 5 minutes for tag to be completed | ||
while not is_complete: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is an inconsistency in the get_tags
method in the AtlasMapTags
class. The inline comment mentions sleeping for 5 seconds and waiting up to 5 minutes for a tag to be completed, but the code actually calls time.sleep(15)
and uses a default wait_time
of 120 seconds. Please update either the code or the comments so that they both reflect the intended behavior.
Important
Adds tag completion waiting mechanism to
AtlasMapTags
and updates version to 3.4.2.wait_time
parameter todf
andget_tags
inAtlasMapTags
to specify max wait time for tag completion.is_tag_complete
inAtlasMapTags
to check tag completion status./v1/project/projection/tags/robotag
endpoint to ensure tag processing.setup.py
from3.4.1
to3.4.2
.This description was created by
for 075d630. It will automatically update as commits are pushed.