Added Aggregation Logic for 'learning_curves' Artifacts #100
Open: adibiasio wants to merge 11 commits into autogluon:master from adibiasio:aggregate_learning_curves_artifacts (base: master)
Changes shown are from 6 of the 11 commits.

Commits:
- `2f85b88` added aggregation logic for learning_curves artifacts
- `bd53527` docstring edits
- `b5e87f6` fixed linting errors
- `623c06e` fixed import linting errors
- `2159a4b` Merge branch 'master' of https://github.com/adibiasio/autogluon-bench…
- `01fb252` move fold number to folder name instead of json file name
- `ae92ade` addressed pr comments
- `06bf520` removing redundant s3 utils
- `89fce00` update log message
- `1925e95` type hint updates
- `4a750c4` fixed doc string example
New file (diff `@@ -0,0 +1,64 @@`):

```python
import logging

from autogluon.bench.eval.benchmark_context.output_suite_context import OutputSuiteContext
from autogluon.common.savers import save_pd

logger = logging.getLogger(__name__)


def aggregate(
    s3_bucket: str,
    module: str,
    benchmark_name: str,
    artifact: str | None = "results",
    constraint: str | None = None,
    include_infer_speed: bool = False,
    mode: str = "ray",
) -> None:
    """
    Aggregates objects across an agbenchmark. Functionality depends on the artifact specified.

    Params:
    -------
    s3_bucket: str
        Name of the relevant s3 bucket.
    module: str
        The name of the relevant autogluon module;
        can be one of ['tabular', 'timeseries', 'multimodal'].
    benchmark_name: str
        The name of the relevant benchmark that was run.
    artifact: str
        The desired artifact to be aggregated; can be one of ['results', 'learning_curves'].
    constraint: str
        Name of the constraint used in the benchmark.
    include_infer_speed: bool
        Include inference speed in the aggregation.
    mode: str
        Can be one of ['seq', 'ray'].
        If seq, runs sequentially.
        If ray, utilizes parallelization.
    """
    result_path = f"{module}/{benchmark_name}"
    path_prefix = f"s3://{s3_bucket}/{result_path}/"
    contains = f".{constraint}." if constraint else None

    output_suite_context = OutputSuiteContext(
        path=path_prefix,
        contains=contains,
        include_infer_speed=include_infer_speed,
        mode=mode,
    )

    if artifact == "learning_curves":
        save_path = f"s3://{s3_bucket}/aggregated/{result_path}/{artifact}"
        artifact_path = output_suite_context.aggregate_learning_curves(save_path=save_path)
    else:
        aggregated_results_name = f"results_automlbenchmark_{constraint}_{benchmark_name}.csv"
        results_df = output_suite_context.aggregate_results()
        print(results_df)
        artifact_path = f"s3://{s3_bucket}/aggregated/{result_path}/{aggregated_results_name}"
        save_pd.save(path=artifact_path, df=results_df)

    logger.info(f"Aggregated output saved to {artifact_path}!")
```
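The path construction inside `aggregate` can be traced without touching S3. A minimal sketch (the bucket, benchmark, and constraint names below are made-up placeholders, not values from this PR):

```python
# Mirror the path logic of aggregate() for a hypothetical run.
s3_bucket = "my-bench-bucket"        # hypothetical bucket name
module = "tabular"
benchmark_name = "ag_bench_20240801" # hypothetical benchmark name
constraint = "1h8c"

result_path = f"{module}/{benchmark_name}"
path_prefix = f"s3://{s3_bucket}/{result_path}/"
contains = f".{constraint}." if constraint else None

# learning_curves artifacts are written under an "aggregated" prefix:
save_path = f"s3://{s3_bucket}/aggregated/{result_path}/learning_curves"

print(path_prefix)  # s3://my-bench-bucket/tabular/ag_bench_20240801/
print(contains)     # .1h8c.
print(save_path)    # s3://my-bench-bucket/aggregated/tabular/ag_bench_20240801/learning_curves
```

Note that `contains` becomes `None` when no constraint is given, so the output filter is skipped entirely in that case.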
This file was deleted.
Changes (diff `@@ -1,11 +1,112 @@`; the previous untyped `get_s3_paths` signature is replaced with a fully annotated one):

```python
from __future__ import annotations

from urllib.parse import urlparse

import boto3

from autogluon.common.loaders import load_s3
from autogluon.common.utils import s3_utils


def is_s3_url(path: str) -> bool:
    """
    Checks whether a path is an s3 URI.

    Params:
    -------
    path: str
        The path to check.

    Returns:
    --------
    bool: whether the path is an s3 URI.
    """
    return path[:2] == "s3" and "://" in path[:6]


def get_bucket_key(s3_uri: str) -> tuple[str, str]:
    """
    Retrieves the bucket and key from an s3 URI.

    Params:
    -------
    s3_uri: str
        The path (s3 URI) to be parsed.

    Returns:
    --------
    bucket_name: str
        The associated bucket name.
    object_key: str
        The associated key.
    """
    if not is_s3_url(s3_uri):
        raise ValueError("Invalid S3 URI scheme. It should be 's3'.")

    parsed_uri = urlparse(s3_uri)
    bucket_name = parsed_uri.netloc
    object_key = parsed_uri.path.lstrip("/")

    return bucket_name, object_key


def get_s3_paths(path_prefix: str, contains: str | None = None, suffix: str | None = None) -> list[str]:
    """
    Gets all s3 paths in the path_prefix that contain 'contains'
    and end with 'suffix'.

    Params:
    -------
    path_prefix: str
        The path prefix.
    contains: str | None, default = None
        Can be specified to limit the returned outputs,
        for example by specifying a constraint such as ".1h8c.".
    suffix: str | None, default = None
        Can be specified to limit the returned outputs,
        for example by specifying "leaderboard.csv" so that only objects
        ending with this suffix are included.
        If no suffix is provided, all files in the artifact directory are included.

    Returns:
    --------
    list[str]: All s3 paths that adhere to the conditions passed in.
    """
    bucket, prefix = s3_utils.s3_path_to_bucket_prefix(path_prefix)
    objects = load_s3.list_bucket_prefix_suffix_contains_s3(
        bucket=bucket, prefix=prefix, suffix=suffix, contains=contains
    )
    paths_full = [s3_utils.s3_bucket_prefix_to_path(bucket=bucket, prefix=file, version="s3") for file in objects]
    return paths_full


def copy_s3_object(origin_path: str, destination_path: str) -> bool:
    """
    Copies an s3 object from origin_path to destination_path.

    Params:
    -------
    origin_path: str
        The path (s3 URI) to the original location of the object.
    destination_path: str
        The path (s3 URI) to the intended destination location of the object.

    Returns:
    --------
    bool: whether the copy was successful.
    """
    origin_bucket, origin_key = get_bucket_key(origin_path)
    destination_bucket, destination_key = get_bucket_key(destination_path)

    try:
        s3 = boto3.client("s3")
        s3.copy_object(
            Bucket=destination_bucket, CopySource={"Bucket": origin_bucket, "Key": origin_key}, Key=destination_key
        )
        return True
    except Exception:
        return False
```
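The URI helpers above can be exercised locally without boto3 or AWS credentials. The snippet below re-implements the two pure functions from the diff so it is self-contained (the bucket and key values are illustrative, not from this PR):

```python
from urllib.parse import urlparse


def is_s3_url(path: str) -> bool:
    # Same check as in the diff above: an "s3" scheme followed by "://".
    return path[:2] == "s3" and "://" in path[:6]


def get_bucket_key(s3_uri: str):
    # Same parsing as in the diff above: netloc is the bucket,
    # the path (minus its leading slash) is the object key.
    if not is_s3_url(s3_uri):
        raise ValueError("Invalid S3 URI scheme. It should be 's3'.")
    parsed = urlparse(s3_uri)
    return parsed.netloc, parsed.path.lstrip("/")


bucket, key = get_bucket_key("s3://my-bucket/tabular/bench/learning_curves.json")
print(bucket)  # my-bucket
print(key)     # tabular/bench/learning_curves.json
print(is_s3_url("https://example.com/file"))  # False
```

Splitting the URI this way is what lets `copy_s3_object` feed `boto3.client("s3").copy_object` its `Bucket`, `CopySource`, and `Key` arguments directly.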