Use dataset streaming, cleanup diagram #497

Merged
merged 2 commits into from
Feb 21, 2025


Collaborator

@rishic3 rishic3 commented Feb 16, 2025

Use streaming to avoid saving the entire Hugging Face dataset to disk when working with large datasets. Also updated the diagram to clarify the client/server interaction.
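
For readers unfamiliar with the approach, here is a minimal sketch of what dataset streaming can look like with the Hugging Face `datasets` library. It is not the code from this PR: the helper names `take_records` and `stream_samples` are hypothetical, and `dataset_name` and `split` are placeholders because the description does not name a dataset.

```python
from itertools import islice
from typing import Any, Dict, Iterable, List


def take_records(records: Iterable[Dict[str, Any]], n: int) -> List[Dict[str, Any]]:
    """Materialize only the first n records of a (possibly huge) iterable."""
    return list(islice(records, n))


def stream_samples(dataset_name: str, split: str, n: int) -> List[Dict[str, Any]]:
    """Fetch n samples lazily instead of downloading the full dataset.

    dataset_name and split are placeholders (assumptions); the PR
    description does not say which dataset the notebooks use.
    """
    # Imported lazily so take_records works without the library installed.
    from datasets import load_dataset

    # streaming=True returns an iterable dataset that fetches records on
    # demand rather than saving the whole dataset to local disk first.
    stream = load_dataset(dataset_name, split=split, streaming=True)
    return take_records(stream, n)
```

Only the `n` requested records are held in memory, so the cost scales with the sample size rather than with the size of the full dataset.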

@rishic3 rishic3 marked this pull request as ready for review February 18, 2025 16:30
@rishic3 rishic3 requested a review from eordentlich February 19, 2025 01:06
Signed-off-by: Rishi Chandra <[email protected]>
eordentlich previously approved these changes Feb 20, 2025
Collaborator

@eordentlich eordentlich left a comment

👍

Collaborator Author

rishic3 commented Feb 20, 2025

Updated the links to point to the actual notebooks, which makes more sense: each notebook already links to its reference example in its header cell.

Collaborator

@eordentlich eordentlich left a comment

👍

@rishic3 rishic3 merged commit efd0ed6 into NVIDIA:branch-25.02 Feb 21, 2025
2 of 3 checks passed
2 participants