Consolidate sparsity checks in CLI and CXG conversion steps. #7308

Open
Bento007 opened this issue Jul 25, 2024 · 3 comments
Labels
dp Data Platform workstream tech Tech issues that do not require product prioritization. Tech debt, tooling, ops, etc.

Comments

@Bento007
Contributor

Motivation

We run a sparsity check in two places within our ingestion pipeline: once during validation and again during CXG conversion. These are two separate algorithms doing the same thing. For efficiency, they should be consolidated.

Definition of Done

  • Consolidate the sparsity checks across the ingestion pipeline. Ideally, sparsity should be calculated only once and the result shared between steps.

Tasks

  • I recommend running the sparsity check in the CLI and passing the value to the CXG conversion step.
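The recommendation above could look roughly like the following. This is a minimal sketch, not the actual CLI or CXG conversion code: `compute_sparsity`, `ValidationResult`, `validate`, `choose_cxg_encoding`, and the `0.5` threshold are all hypothetical names and values chosen for illustration, and the matrix is modelled as plain Python lists to keep the example self-contained.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

Matrix = Sequence[Sequence[float]]


def compute_sparsity(matrix: Matrix) -> float:
    """Return the fraction of entries in the matrix that are zero."""
    total = 0
    zeros = 0
    for row in matrix:
        total += len(row)
        zeros += sum(1 for value in row if value == 0)
    if total == 0:
        raise ValueError("cannot compute sparsity of an empty matrix")
    return zeros / total


@dataclass(frozen=True)
class ValidationResult:
    """Hypothetical result object produced by the validation (CLI) step."""
    is_valid: bool
    sparsity: float


def validate(matrix: Matrix) -> ValidationResult:
    """Stand-in for the validation step: sparsity is computed once, here."""
    return ValidationResult(is_valid=True, sparsity=compute_sparsity(matrix))


def choose_cxg_encoding(
    matrix: Matrix,
    sparsity: Optional[float] = None,
    threshold: float = 0.5,  # assumed cutoff, not taken from the pipeline
) -> str:
    """Stand-in for the conversion step: reuse the passed-in sparsity.

    The matrix is only rescanned when no precomputed value is supplied.
    """
    if sparsity is None:
        sparsity = compute_sparsity(matrix)
    return "sparse" if sparsity >= threshold else "dense"


matrix = [[0, 0, 1], [0, 2, 0]]
result = validate(matrix)
encoding = choose_cxg_encoding(matrix, sparsity=result.sparsity)
```

Keeping the `sparsity` argument optional lets the conversion step still run standalone, while the normal pipeline path avoids a second pass over the matrix.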
@Bento007 added the "dp" (Data Platform workstream) and "tech" (Tech issues that do not require product prioritization. Tech debt, tooling, ops, etc.) labels on Jul 25, 2024

@nayib-jose-gloria
Contributor

Estimate (incl. testing): 2-3 days

@Bento007, please comment if you agree.

@Bento007
Contributor Author

2-3 days if my suggested implementation is used.

@nayib-jose-gloria
Contributor

Deferring this to DP maintenance work, since it is not the root cause of the memory issues, as determined during the investigation of #7310.

2 participants