cumulativesum vulkan shader #6475
Codecov Report❌ Patch coverage is
Additional details and impacted files (coverage diff, master vs. #6475):
| | master | #6475 | +/- |
|---|---|---|---|
| Coverage | 95.89% | 95.90% | +0.01% |
| Files | 844 | 845 | +1 |
| Lines | 266044 | 265932 | -112 |
| Hits | 255114 | 255042 | -72 |
| Misses | 10930 | 10890 | -40 |
Pull request overview
This PR adds a Vulkan GPU shader implementation of the cumulative sum operation, enabling hardware-accelerated prefix sum computation along a selected tensor axis.
Key changes:
- Implements a three-pass algorithm: block-level scan, block sums scan, and offset addition
- Supports 1D, 2D, and 3D tensors with axis selection
- Uses the Kogge-Stone parallel scan algorithm with 256-element work groups
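For readers unfamiliar with the Kogge-Stone scan named above, here is a minimal CPU sketch in Python. It is not the PR's shader code. The function name `kogge_stone_scan` is hypothetical, and the list rebuilt on each step stands in for the shared-memory barrier a GPU work group would use. It assumes an inclusive scan.

```python
def kogge_stone_scan(block):
    """Inclusive prefix sum of one block using Kogge-Stone steps.

    Each loop iteration models one lockstep round in which every
    "thread" i adds the value `offset` positions to its left.
    """
    data = list(block)
    n = len(data)
    offset = 1
    while offset < n:
        # Read only from the previous round's values; on a GPU this
        # separation is what a work-group barrier would enforce.
        prev = data
        data = [
            prev[i] + prev[i - offset] if i >= offset else prev[i]
            for i in range(n)
        ]
        offset *= 2
    return data
```

The offset doubles each round, so a 256-element block would take 8 rounds. That is the usual reason this scan suits a fixed work-group size.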
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| src/layer/vulkan/shader/cumulativesum_blockscan.comp | First pass shader that performs prefix scan within 256-element blocks |
| src/layer/vulkan/shader/cumulativesum_blocksums_scan.comp | Second pass shader that scans the block sums to compute offsets |
| src/layer/vulkan/shader/cumulativesum_addoffset.comp | Third pass shader that adds block offsets to complete the cumulative sum |
| src/layer/vulkan/cumulativesum_vulkan.h | Header declaring the Vulkan implementation class with three pipeline stages |
| src/layer/vulkan/cumulativesum_vulkan.cpp | Implementation managing pipeline creation and multi-pass execution strategy |
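How the three passes in the table fit together can be illustrated with a small CPU reference in Python. This is a sketch under assumptions, not the PR's implementation. The name `three_pass_cumsum` is hypothetical, `itertools.accumulate` stands in for the per-block GPU scan, and the block-sums scan is assumed to yield an exclusive offset for each block. Only the 1D case is shown.

```python
from itertools import accumulate

BLOCK_SIZE = 256  # work-group size named in the review overview


def three_pass_cumsum(values, block_size=BLOCK_SIZE):
    """Inclusive cumulative sum computed in three passes over fixed-size blocks."""
    # Pass 1 (block scan): scan each block independently.
    blocks = [
        list(accumulate(values[start:start + block_size]))
        for start in range(0, len(values), block_size)
    ]
    # The last element of each scanned block is that block's total.
    block_sums = [b[-1] for b in blocks]

    # Pass 2 (block sums scan): the offset for block k is the sum of the
    # totals of all blocks before it (assumed exclusive scan).
    scanned = list(accumulate(block_sums))
    offsets = [0] + scanned[:-1]

    # Pass 3 (add offset): shift every element by its block's offset.
    return [x + off for b, off in zip(blocks, offsets) for x in b]
```

For example, `three_pass_cumsum([1, 2, 3, 4, 5], block_size=2)` scans the blocks `[1, 2]`, `[3, 4]`, `[5]` separately, derives the offsets `0, 3, 10`, and returns `[1, 3, 6, 10, 15]`.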
No description provided.