-
Notifications
You must be signed in to change notification settings - Fork 101
[rocprofiler-compute] Analysis Database Schema Improvements (v1.2.0) #2526
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
[rocprofiler-compute] Analysis Database Schema Improvements (v1.2.0) #2526
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR updates the analysis database schema from v1.1.0 to v1.2.0 to better model the relationship between data tables and kernels. The key change is moving pc_sampling and roofline_data tables to relate directly to the kernel table instead of the workload table, which more accurately reflects the data structure where these metrics are per-kernel rather than per-workload.
Key changes:
- Schema relationships refactored so
pc_samplingandroofline_datanow reference kernels instead of workloads - Removed redundant
kernel_namefields frompc_samplingandroofline_datatables since this information is now available through the foreign key relationship - Updated database population logic to create kernel objects first, then link roofline and PC sampling data to them
Reviewed changes
Copilot reviewed 2 out of 3 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| projects/rocprofiler-compute/src/utils/analysis_orm.py | Updated ORM schema: incremented version to 1.2.0, moved relationships from Workload to Kernel for RooflineData and PCsampling tables, removed redundant kernel_name columns |
| projects/rocprofiler-compute/src/rocprof_compute_analyze/analysis_db.py | Refactored database population logic to create kernels from dispatch data first, then link roofline and PC sampling data to kernel objects; added error handling for missing kernels in PC sampling data |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
@vedithal-amd I've opened a new pull request, #2536, to work on those changes. Once the pull request is ready, I'll request review from you. |
* `pc_sampling` and `roofline_data` tables should relate to `kernel` table instead of `workload` table * Remove `kernel_name` fields in `pc_sampling` and `roofline_data` table
* Initial plan * Add kernel existence check for roofline data to prevent KeyError Co-authored-by: vedithal-amd <[email protected]> --------- Co-authored-by: copilot-swe-agent[bot] <[email protected]> Co-authored-by: vedithal-amd <[email protected]>
a572b2f to
ed45b0b
Compare
Reorganize the database ORM to decouple metric definitions from kernel objects. This improves the schema design by: - Rename Metric -> MetricDefinition and Value -> MetricValue for clarity - Move metric definitions from kernel-level to workload-level, since metric definitions are shared across kernels - Update relationships: MetricDefinition belongs to Workload, MetricValue references both MetricDefinition and Kernel - Refactor metric_view to join through the new schema structure - Update test fixtures to use renamed table and class names - Update documentation with new example output using nbody workload - Regenerate database schema and views diagrams
|
@feizheng10 and @cfallows-amd could you please review these analysis database improvements |
|
Schema update looks good. |
feizheng10
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LG
Motivation
The analysis database schema had several issues impacting performance, correctness, and data modeling:
Technical Details
Schema Refactoring
pc_samplingandroofline_datato referencekerneltable directly instead ofworkload, removing redundantkernel_namefieldsMetric→MetricDefinitionandValue→MetricValuefor clarityMetricDefinitionfrom kernel-level to workload-level (metrics shared across kernels)MetricValueto reference bothMetricDefinitionandKernelmetric_viewto join through new schema structurePerformance Optimization
Bug Fixes
Updates
compute_metric_definition,compute_metric_value)JIRA ID
AIPROFCOMP-62
Test Plan
Ensure test_analyze_rocpd test passes in tests/test_profile_general.py
Test Result
Tests pass
Submission Checklist