Sfpu recip sqrt implementation #1038

vkrsmanovicTT · 2025-12-30T21:23:52Z

Ticket

N/A - Initial SFPU rsqrt implementation for Quasar architecture

Problem description

The Quasar architecture needed an implementation of the rsqrt (reciprocal square root: 1/sqrt(x)) SFPU operation.

What's changed

Core Implementation:
- Added ckernel_sfpu_rsqrt.h for Quasar with SFPU instructions to compute rsqrt using hardware sqrt and reciprocal operations
- Implemented rsqrt by chaining SQRT_MODE and RECIP_MODE SFPU nonlinear instructions

Test Infrastructure:
- Added test_sfpu_rsqrt_quasar.py Python test with random input generation in range [0.01, 1.0]
- Added sfpu_rsqrt_quasar_test.cpp C++ kernel implementing datacopy + SFPU pipeline

Current Status:
- Implementation is working for a limited set of format combinations (Float16 input/output)
- Test sweep currently covers basic approximation mode and destination accumulation variants
- Test sweep will be expanded in future updates to cover additional data formats, tile sizes, and edge cases

Type of change

[ ] Bug fix (non-breaking change which fixes an issue)
[x] New feature (non-breaking change which adds functionality)
[ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)
[ ] Documentation update
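
The chaining described under "What's changed" can be modeled on the host as a square root followed by a reciprocal. The following is a minimal sketch in plain Python, not the SFPU kernel from ckernel_sfpu_rsqrt.h: the names `rsqrt_chained` and `make_inputs` are hypothetical, and `math.sqrt` and the division only stand in for the SQRT_MODE and RECIP_MODE steps. The input range [0.01, 1.0] is the one stated for the initial test.

```python
import math
import random


def rsqrt_chained(x):
    """Host-side model of rsqrt: square root first, then reciprocal."""
    root = math.sqrt(x)  # stands in for the SQRT_MODE step (assumption)
    return 1.0 / root    # stands in for the RECIP_MODE step (assumption)


def make_inputs(count, seed=0):
    """Random inputs in [0.01, 1.0], the range named in the PR description."""
    rng = random.Random(seed)  # seeded so a failing case can be reproduced
    return [rng.uniform(0.01, 1.0) for _ in range(count)]


if __name__ == "__main__":
    for x in make_inputs(4):
        print(f"x={x:.4f}  rsqrt={rsqrt_chained(x):.4f}")
```

Keeping the inputs strictly positive and away from zero avoids the division by zero and the very large outputs that rsqrt produces near 0, which is presumably why the initial range starts at 0.01.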

github-actions · 2025-12-30T21:24:03Z

Thank you for your contribution! 🚀
If you want to run metal post-commit tests, you can add the metal-post-commit-tests label to this pull request.
📖 For more information, please refer to our CONTRIBUTING guide.

Copilot

Pull request overview

This PR implements the rsqrt (reciprocal square root: 1/sqrt(x)) SFPU operation for the Quasar architecture. The implementation chains hardware SQRT and RECIP operations to compute rsqrt, adds comprehensive test infrastructure with Python and C++ tests, and includes supporting changes to helper utilities.

Adds core rsqrt kernel implementation by chaining SQRT_MODE and RECIP_MODE SFPU instructions
Implements test infrastructure with random input generation in range [0.01, 1.0] and golden reference validation
Enhances PCC calculation to better handle edge cases with masked/invalid values
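
As an illustration of what handling masked or invalid values in a PCC calculation can look like, here is a minimal sketch of a Pearson correlation coefficient that drops non-finite pairs before correlating. It is an assumption-laden stand-in, not the code in tests/python_tests/helpers/utils.py: the function name `pcc`, the NaN return for too little data, and the equality fallback for constant inputs are all choices made for this example.

```python
import math


def pcc(golden, result):
    """Pearson correlation over pairs where both values are finite."""
    # Drop any pair containing inf or NaN (the "masked/invalid" case).
    pairs = [(g, r) for g, r in zip(golden, result)
             if math.isfinite(g) and math.isfinite(r)]
    if len(pairs) < 2:
        return float("nan")  # not enough valid data to correlate

    gs = [g for g, _ in pairs]
    rs = [r for _, r in pairs]
    mean_g = sum(gs) / len(gs)
    mean_r = sum(rs) / len(rs)

    cov = sum((g - mean_g) * (r - mean_r) for g, r in pairs)
    var_g = sum((g - mean_g) ** 2 for g in gs)
    var_r = sum((r - mean_r) ** 2 for r in rs)

    if var_g == 0.0 or var_r == 0.0:
        # Constant data: correlation is undefined, so compare directly
        # (a choice made for this sketch).
        return 1.0 if gs == rs else 0.0
    return cov / math.sqrt(var_g * var_r)
```

Filtering first means a single inf or NaN in either tensor cannot turn the whole coefficient into NaN, while the explicit branches keep the degenerate cases from dividing by zero.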

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.


File	Description
tt_llk_quasar/llk_lib/llk_defs.h	Adds rsqrt enum value to SfpuType enumeration
tt_llk_quasar/common/inc/sfpu/ckernel_sfpu_rsqrt.h	Core rsqrt implementation chaining sqrt and reciprocal SFPU operations
tests/sources/quasar/sfpu_rsqrt_quasar_test.cpp	C++ kernel test implementing datacopy + SFPU rsqrt pipeline
tests/python_tests/quasar/test_sfpu_rsqrt_quasar.py	Python test with parametrized configurations and golden reference validation
tests/python_tests/helpers/utils.py	Refactors PCC calculation to handle edge cases more robustly
tests/python_tests/helpers/test_variant_parameters.py	Updates operation constant generation to handle Quasar SfpuType namespace
tests/python_tests/helpers/test_config.py	Adds null check for dest_acc before format inference
tests/python_tests/helpers/device.py	Skips assert handling for Quasar architecture

- Implement rsqrt operation test for Quasar architecture
- Fix dvalid synchronization for 3-stage FPU->SFPU->PACK pipeline
- Call set_up_dest_dvalid_per_thread twice (FPU and SFPU) in MATH kernel
- Support Float16, Float16_b, and Float32 formats
- Support 32x32 and 64x64 tile dimensions
- Skip unsupported format combinations (non-Float32->Float32 with dest_acc=No, Float32->Float16 with dest_acc=No)
- Use _llk_math_eltwise_unary_sfpu_params_ wrapper for proper face iteration
- Correct SFPU iteration count per face (not total)
- Add wait_idle calls at end of MATH kernel
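
The skip rule for unsupported format combinations can be sketched as a filter applied when the test parameters are generated. This is one reading of the commit message, written as a self-contained example: `FORMATS`, `DEST_ACC`, `is_supported`, and `generate_combinations` are hypothetical names, and the real test uses the repository's own helper types rather than plain strings.

```python
from itertools import product

# Plain-string stand-ins for the repository's format and dest_acc types
# (assumption for this sketch).
FORMATS = ["Float16", "Float16_b", "Float32"]
DEST_ACC = ["No", "Yes"]


def is_supported(in_fmt, out_fmt, dest_acc):
    """Apply the two skip rules listed in the commit message."""
    if dest_acc == "No":
        # non-Float32 -> Float32 with dest_acc=No is skipped
        if in_fmt != "Float32" and out_fmt == "Float32":
            return False
        # Float32 -> Float16 with dest_acc=No is skipped
        if in_fmt == "Float32" and out_fmt == "Float16":
            return False
    return True


def generate_combinations():
    """Build (input, output, dest_acc) tuples, dropping unsupported ones."""
    return [combo for combo in product(FORMATS, FORMATS, DEST_ACC)
            if is_supported(*combo)]


if __name__ == "__main__":
    for combo in generate_combinations():
        print(combo)
```

Filtering while the combinations are generated, rather than skipping inside the test body, keeps unsupported cases out of the sweep entirely instead of reporting them as skipped.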

vkrsmanovicTT added 7 commits December 17, 2025 17:13

draft sfpu for quasar test

6e309b0

Merge branch 'main' into sfpu_merge

1354fc9

Merge branch 'main' into sfpu_merge

d5aa344

Merge branch 'main' into sfpu_merge

95780ef

Merge latest main into sfpu_merge

c3af10f

sfpu rsqrt implementation sweep

745750a

random number for approx mode generation issue

5a5963e

Copilot AI review requested due to automatic review settings December 30, 2025 21:23

vkrsmanovicTT requested review from amokanTT, fvranicTT, ldjurovicTT, nvelickovicTT, skotaracTT, skrsmanovicTT and sstanisicTT as code owners December 30, 2025 21:23

Copilot started reviewing on behalf of vkrsmanovicTT December 30, 2025 21:24

github-actions bot added the quasar and test-infra labels Dec 30, 2025

input values for sfpu

75b21a2

Copilot AI reviewed Dec 30, 2025

tt_llk_quasar/common/inc/sfpu/ckernel_sfpu_rsqrt.h (outdated, resolved)

tests/python_tests/helpers/utils.py (outdated, resolved)

fvranicTT reviewed Dec 31, 2025

tests/python_tests/helpers/test_variant_parameters.py (outdated, resolved)

fvranicTT reviewed Dec 31, 2025

tests/python_tests/helpers/test_config.py (outdated, resolved)

fvranicTT and others added 7 commits December 31, 2025 13:08

Merge branch 'main' into sfpu_merge

cc35c55

Merge branch 'main' into sfpu_merge

3256b32

Data copy working for values from 0.1 to 2 for approx mode

f00e2b4

Merge branch 'main' into sfpu_merge

ee889d0

SFPU issue handling for math

572f4a0

Merge branch 'main' into sfpu_merge

3746b88

fvranicTT reviewed Jan 14, 2026

tt_llk_quasar/common/inc/sfpu/ckernel_sfpu_rsqrt.h (outdated, resolved)

vkrsmanovicTT added 9 commits January 14, 2026 17:29

Addressing comments

16a3e6d

expanding range test

52be245

Fix SFPU rsqrt test when unpack_to_dest is true

1127856

sync for float32 input format on quasar sfpu!

686ba45

Merge branch 'main' into sfpu_merge

4263c5c

Skip Float16_b to Float32 conversion with dest_acc=No for Quasar

e34bd5a

Remove redundant skip condition for Float16_b to Float32

16f68bf

Merge branch 'main' into sfpu_merge

55383fc

Filter invalid format combinations at generation time like test_pack_…

0991117

…quasar

fvranicTT reviewed Jan 20, 2026

tests/python_tests/quasar/test_sfpu_rsqrt_quasar.py (resolved)

fvranicTT reviewed Jan 20, 2026

tests/sources/quasar/sfpu_rsqrt_quasar_test.cpp (outdated, resolved)

fvranicTT reviewed Jan 20, 2026

tests/sources/quasar/sfpu_rsqrt_quasar_test.cpp (outdated, resolved)

vkrsmanovicTT added 4 commits January 20, 2026 16:40

Merge branch 'main' into sfpu_merge

4dc6596

Addressing comments!

20062cd

Merge branch 'main' into sfpu_merge

ff5deef

expand range and remove approx mode

04e0288

rtawfik01 reviewed Jan 21, 2026

tests/sources/quasar/sfpu_rsqrt_quasar_test.cpp (outdated, resolved)

vkrsmanovicTT added 2 commits January 21, 2026 21:17

removing approx mode from code

b3108bb

Merge branch 'main' into sfpu_merge

37f93fa

rtawfik01 approved these changes Jan 21, 2026

fvranicTT reviewed Jan 21, 2026

tt_llk_quasar/llk_lib/llk_math_eltwise_unary_sfpu_common.h (outdated, resolved)

reverting function implementation

3e3f537

fvranicTT approved these changes Jan 21, 2026

tt_llk_quasar/common/inc/sfpu/ckernel_sfpu_rsqrt.h (outdated, resolved)

tests/sources/quasar/sfpu_rsqrt_quasar_test.cpp (outdated, resolved)

tests/python_tests/quasar/test_sfpu_rsqrt_quasar.py (outdated, resolved)

changes 2025->2026

b066be7

vkrsmanovicTT enabled auto-merge January 21, 2026 22:14

vkrsmanovicTT added this pull request to the merge queue Jan 21, 2026

Merged via the queue into main with commit ab381b4 Jan 21, 2026
32 checks passed

vkrsmanovicTT deleted the sfpu_merge branch January 21, 2026 23:55
