-
Notifications
You must be signed in to change notification settings - Fork 22
Adding base eltwise_binary test for WH/BH #1047
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
Thank you for your contribution! 🚀 |
1d22eb3 to
3e7877f
Compare
2fa9902 to
3a1f8bb
Compare
3a1f8bb to
935cbfb
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR adds a base eltwise binary test for Wormhole (WH) and Blackhole (BH) architectures without broadcast or transpose functionality, and refactors the existing LLK math code to reduce duplication while enabling future tile dimension support.
Changes:
- Refactored LLK math eltwise binary code for WH/BH to consolidate duplicate code patterns using helper functions
- Added new base eltwise binary test infrastructure supporting variable tile dimensions and multiple tile blocks
- Extended Python test helpers to support variable tile dimensions (not just 32x32)
Reviewed changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| tt_llk_wormhole_b0/llk_lib/llk_math_eltwise_binary.h | Refactored to add helper functions eltwise_binary_func and eltwise_binary_reuse_dest_helper_func to eliminate code duplication in ELWMUL path |
| tt_llk_blackhole/llk_lib/llk_math_eltwise_binary.h | Similar refactoring as Wormhole; moved eltwise_binary_configure_addrmod to top of file |
| tests/sources/eltwise_binary_test.cpp | New C++ test implementing unpack, math, and pack kernels for basic eltwise binary operations with block/tile support |
| tests/python_tests/test_eltwise_binary.py | New Python test file with parametrized test setup for eltwise binary operations |
| tests/python_tests/helpers/unpack.py | Updated unpack_res_tiles to support variable tile dimensions |
| tests/python_tests/helpers/tilize_untilize.py | Added tilize_tile function and updated tilize_block to support variable tile dimensions |
| tests/python_tests/helpers/test_variant_parameters.py | Added NUM_TILES_IN_BLOCK and NUM_BLOCKS runtime parameters |
| tests/python_tests/helpers/test_config.py | Updated hash generation to include stimuli address-related fields |
| tests/python_tests/helpers/stimuli_generator.py | Added sequential_A and sequential_B parameters for generating sequential test data |
| tests/python_tests/helpers/stimuli_config.py | Updated to use tile_elements parameter instead of hardcoded TILE_ELEMENTS |
| tests/python_tests/helpers/golden_generators.py | Updated TilizeGolden to support variable tile dimensions |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
|
||
| assert len(res_from_L1) == len( | ||
| golden_tensor | ||
| ), "Result tensor and golder tensor are not of the same length" |
Copilot
AI
Jan 10, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Spelling error: "golder" should be "golden".
| ), "Result tensor and golder tensor are not of the same length" | |
| ), "Result tensor and golden tensor are not of the same length" |
935cbfb to
619e1fb
Compare
ncvetkovicTT
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks a lot cleaner ^^
| const int num_blocks = params->NUM_BLOCKS; | ||
|
|
||
| // Configure hardware for unpacking, no broadcast, no transpose | ||
| _llk_unpack_hw_configure_<is_fp32_dest_acc_en>( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Elsewhere in the code of the kernel you added false/true even for "unused" or rather "don't care" parameters. Can we do the same here for disable_src_zero_flag? I prefer having:
| _llk_unpack_hw_configure_<is_fp32_dest_acc_en>( | |
| _llk_unpack_hw_configure_<is_fp32_dest_acc_en, false /*disable_src_zero_flag*/>( |
since it will be easier for us to make the changes at the place of calling _llk_unpack_hw_configure_ when we remove disable_src_zero_flag from it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will do
| } | ||
| } | ||
| constexpr uint32_t outerloop = (high_fidelity) ? 2 : ((binary_reuse_dest != EltwiseBinaryReuseDestType::NONE) ? 2 : 1); | ||
| eltwise_binary_reuse_dest_helper_func<is_fp32_dest_acc_en, binary_reuse_dest>(outerloop, 0, clear_fp32_dst_acc, dst_index); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit:
| eltwise_binary_reuse_dest_helper_func<is_fp32_dest_acc_en, binary_reuse_dest>(outerloop, 0, clear_fp32_dst_acc, dst_index); | |
| eltwise_binary_reuse_dest_helper_func<is_fp32_dest_acc_en, binary_reuse_dest>(outerloop, 0 /*face_base_offset*/, clear_fp32_dst_acc, dst_index); |
| } | ||
| tmp.program(); | ||
| } | ||
| ckernel_template tmp(outerloop, innerloop, eltwise_binary_func<eltwise_binary_type>(0, acc_to_dest, broadcast_type, addr_mod)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| ckernel_template tmp(outerloop, innerloop, eltwise_binary_func<eltwise_binary_type>(0, acc_to_dest, broadcast_type, addr_mod)); | |
| ckernel_template tmp(outerloop, innerloop, eltwise_binary_func<eltwise_binary_type>(0 /*clr_src*/, acc_to_dest, broadcast_type, addr_mod)); |
| ckernel_template tmp(outerloop, innerloop, TT_OP_ELWSUB(0, acc_to_dest, broadcast_type, addr_mod, 0)); | ||
| tmp.set_end_op(TT_OP_SETRWC(p_setrwc::CLR_AB, p_setrwc::CR_AB, 0, 0, 0, p_setrwc::SET_AB)); | ||
| tmp.program(); | ||
| tmp.set_last_inner_loop_instr(eltwise_binary_func<ELWMUL>(0, 0, broadcast_type, ADDR_MOD_2)); // Incr fidelity last inst of inner loop |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| tmp.set_last_inner_loop_instr(eltwise_binary_func<ELWMUL>(0, 0, broadcast_type, ADDR_MOD_2)); // Incr fidelity last inst of inner loop | |
| tmp.set_last_inner_loop_instr(eltwise_binary_func<ELWMUL>(0 /*clr_src*/, 0 /*acc_to_dest*/, broadcast_type, ADDR_MOD_2)); // Incr fidelity last inst of inner loop |
| tmp.program(); | ||
| } | ||
| else if constexpr (eltwise_binary_type == ELWSUB) | ||
| ckernel_template tmp(high_fidelity ? NUM_FIDELITY_PHASES : outerloop, innerloop, eltwise_binary_func<ELWMUL>(0, 0, broadcast_type, addr_mod)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| ckernel_template tmp(high_fidelity ? NUM_FIDELITY_PHASES : outerloop, innerloop, eltwise_binary_func<ELWMUL>(0, 0, broadcast_type, addr_mod)); | |
| ckernel_template tmp(high_fidelity ? NUM_FIDELITY_PHASES : outerloop, innerloop, eltwise_binary_func<ELWMUL>(0 /*clr_src*/, 0 /*acc_to_dest*/, broadcast_type, addr_mod)); |
| tmp.set_end_op(TT_OP_SETRWC(p_setrwc::CLR_AB, p_setrwc::CR_AB, 0, 0, 0, p_setrwc::SET_AB)); | ||
| tmp.program(); | ||
| tmp.set_last_inner_loop_instr(eltwise_binary_func<ELWMUL>(0, 0, broadcast_type, ADDR_MOD_2)); // Incr fidelity last inst of inner loop | ||
| tmp.set_last_outer_loop_instr(eltwise_binary_func<ELWMUL>(CLR_SRC, 0, broadcast_type, ADDR_MOD_3)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| tmp.set_last_outer_loop_instr(eltwise_binary_func<ELWMUL>(CLR_SRC, 0, broadcast_type, ADDR_MOD_3)); | |
| tmp.set_last_outer_loop_instr(eltwise_binary_func<ELWMUL>(CLR_SRC, 0 /*acc_to_dest*/, broadcast_type, ADDR_MOD_3)); |
fa5be5c to
d6cf043
Compare
| # Elements to pack per tile: | ||
| # - For tilize tests (write_full_tiles=True): write all 1024 elements | ||
| # - For other tests: write only the faces we care about | ||
| if write_full_tiles: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@amokanTT for some reason I have to add this hack, or else test_tilize_sweep fails, I don't understand why test_tilize_sweep even for smaller tile dimensions needs full tiles in L1? I think this is an issue with tilize test
2c89ee1 to
1fbf97b
Compare
Ticket
#1034
Problem description
Adding Eltwise binary base test + enabling future tile size support for WH/BH
What's changed
For test infra:
For llk_math_eltwise_binary.h, refactoring changes to enable piping tile dim support (too many duplications meant piping redundant tile dims):
Type of change
Checklist