Sfpu recip sqrt implementation #1038
Pull request overview
This PR implements the rsqrt (reciprocal square root: 1/sqrt(x)) SFPU operation for the Quasar architecture. The implementation chains hardware SQRT and RECIP operations to compute rsqrt, adds comprehensive test infrastructure with Python and C++ tests, and includes supporting changes to helper utilities.
- Adds core rsqrt kernel implementation by chaining SQRT_MODE and RECIP_MODE SFPU instructions
- Implements test infrastructure with random input generation in range [0.01, 1.0] and golden reference validation
- Enhances PCC calculation to better handle edge cases with masked/invalid values
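As a rough illustration of the last point, the sketch below computes a Pearson correlation coefficient (PCC) while masking out pairs that contain NaN or infinity. It is not the code in `tests/python_tests/helpers/utils.py`; the function name `masked_pcc`, the fallback tolerances, and the handling of constant inputs are all assumptions made for this example.

```python
import math


def masked_pcc(golden, result):
    """Pearson correlation over pairs where both values are finite.

    Hypothetical helper: the name and fallback rules are assumptions,
    not the repository's actual PCC implementation.
    """
    # Mask out any pair in which either side is NaN or +/-inf.
    pairs = [(g, r) for g, r in zip(golden, result)
             if math.isfinite(g) and math.isfinite(r)]
    if len(pairs) < 2:
        # Too few valid points to correlate; accept only exact agreement.
        return 1.0 if all(g == r for g, r in pairs) else 0.0

    mean_g = sum(g for g, _ in pairs) / len(pairs)
    mean_r = sum(r for _, r in pairs) / len(pairs)
    cov = sum((g - mean_g) * (r - mean_r) for g, r in pairs)
    var_g = sum((g - mean_g) ** 2 for g, _ in pairs)
    var_r = sum((r - mean_r) ** 2 for _, r in pairs)

    if var_g == 0.0 or var_r == 0.0:
        # Constant data: correlation is undefined, so fall back to closeness.
        close = all(math.isclose(g, r, rel_tol=1e-3, abs_tol=1e-6)
                    for g, r in pairs)
        return 1.0 if close else 0.0
    return cov / math.sqrt(var_g * var_r)
```

The two early returns cover the cases where a plain correlation formula would divide by zero, which is one plausible reading of "edge cases" here.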
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| tt_llk_quasar/llk_lib/llk_defs.h | Adds rsqrt enum value to SfpuType enumeration |
| tt_llk_quasar/common/inc/sfpu/ckernel_sfpu_rsqrt.h | Core rsqrt implementation chaining sqrt and reciprocal SFPU operations |
| tests/sources/quasar/sfpu_rsqrt_quasar_test.cpp | C++ kernel test implementing datacopy + SFPU rsqrt pipeline |
| tests/python_tests/quasar/test_sfpu_rsqrt_quasar.py | Python test with parametrized configurations and golden reference validation |
| tests/python_tests/helpers/utils.py | Refactors PCC calculation to handle edge cases more robustly |
| tests/python_tests/helpers/test_variant_parameters.py | Updates operation constant generation to handle Quasar SfpuType namespace |
| tests/python_tests/helpers/test_config.py | Adds null check for dest_acc before format inference |
| tests/python_tests/helpers/device.py | Skips assert handling for Quasar architecture |
- Implement rsqrt operation test for Quasar architecture
- Fix dvalid synchronization for 3-stage FPU->SFPU->PACK pipeline
- Call set_up_dest_dvalid_per_thread twice (FPU and SFPU) in MATH kernel
- Support Float16, Float16_b, and Float32 formats
- Support 32x32 and 64x64 tile dimensions
- Skip unsupported format combinations (non-Float32->Float32 with dest_acc=No, Float32->Float16 with dest_acc=No)
- Use _llk_math_eltwise_unary_sfpu_params_ wrapper for proper face iteration
- Correct SFPU iteration count per face (not total)
- Add wait_idle calls at end of MATH kernel
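The format-skipping rule in the commit message can be read as a simple predicate. The sketch below is one interpretation, not the test's actual code: the function name `is_supported` is hypothetical, "Float32->Float16" is taken to mean exactly `Float16` (not `Float16_b`), and every combination not listed as skipped is assumed to run.

```python
# Hypothetical sketch of the skip rule; names and string format labels
# are assumptions for illustration only.
FORMATS = ["Float16", "Float16_b", "Float32"]


def is_supported(input_format, output_format, dest_acc):
    """Return False for the combinations the commit message says are skipped."""
    if dest_acc:
        # Assumption: nothing is skipped when dest_acc is enabled.
        return True
    if input_format != "Float32" and output_format == "Float32":
        # non-Float32 -> Float32 with dest_acc=No
        return False
    if input_format == "Float32" and output_format == "Float16":
        # Float32 -> Float16 with dest_acc=No
        return False
    return True


# Enumerate the sweep that would remain after skipping.
supported = [
    (src, dst, acc)
    for src in FORMATS
    for dst in FORMATS
    for acc in (False, True)
    if is_supported(src, dst, acc)
]
```

Under these assumptions, 15 of the 18 possible (input, output, dest_acc) combinations remain in the sweep.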
Ticket
N/A - Initial SFPU rsqrt implementation for Quasar architecture
Problem description
The Quasar architecture needed an implementation of the rsqrt (reciprocal square root: 1/sqrt(x)) SFPU operation.
What's changed
Core Implementation:
- Added ckernel_sfpu_rsqrt.h for Quasar with SFPU instructions to compute rsqrt using hardware sqrt and reciprocal operations
- Implemented rsqrt by chaining SQRT_MODE and RECIP_MODE SFPU nonlinear instructions
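The chaining described above amounts to computing rsqrt in two stages. A minimal host-side sketch in Python (not the SFPU kernel itself; `rsqrt_chained` is a hypothetical name, and rejecting non-positive inputs is an assumption of this sketch rather than documented hardware behavior):

```python
import math


def rsqrt_chained(x):
    """Compute 1/sqrt(x) as two separate steps, mirroring sqrt-then-reciprocal."""
    if x <= 0.0:
        # Assumption: this sketch only models strictly positive inputs.
        raise ValueError("rsqrt_chained expects x > 0")
    root = math.sqrt(x)   # stage 1: square root
    return 1.0 / root     # stage 2: reciprocal of the stage-1 result
```

For example, `rsqrt_chained(0.25)` evaluates `sqrt(0.25) = 0.5` and then `1 / 0.5 = 2.0`.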
Test Infrastructure:
- Added test_sfpu_rsqrt_quasar.py Python test with random input generation in range [0.01, 1.0]
- Added sfpu_rsqrt_quasar_test.cpp C++ kernel implementing datacopy + SFPU pipeline
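To illustrate the input range, the following sketch generates one 32x32 tile of random values in [0.01, 1.0] and builds a reference against which device output could be compared. The helper names, the fixed seed, and the use of maximum relative error as the metric are assumptions for this example; the actual test may generate and validate data differently.

```python
import math
import random


def generate_inputs(count, low=0.01, high=1.0, seed=0):
    """Uniform random inputs; the lower bound keeps 1/sqrt(x) well away from overflow."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return [rng.uniform(low, high) for _ in range(count)]


def golden_rsqrt(values):
    """Reference result computed on the host in double precision."""
    return [1.0 / math.sqrt(v) for v in values]


def max_relative_error(golden, result):
    """Largest |result - golden| / |golden| over all elements."""
    return max(abs(r - g) / abs(g) for g, r in zip(golden, result))


inputs = generate_inputs(32 * 32)   # one 32x32 tile
golden = golden_rsqrt(inputs)
```

With inputs restricted to [0.01, 1.0], the reference outputs fall between 1 and 10, which sits comfortably inside the representable range of the 16-bit float formats.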
Current Status:
- Implementation is working for a limited set of format combinations (Float16 input/output)
- Test sweep currently covers basic approximation mode and destination accumulation variants
- Test sweep will be expanded in future updates to cover additional data formats, tile sizes, and edge cases
Type of change
[ ] Bug fix (non-breaking change which fixes an issue)
[x] New feature (non-breaking change which adds functionality)
[ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)
[ ] Documentation update