Skip to content

Conversation

@rtawfik01
Copy link
Contributor

@rtawfik01 rtawfik01 commented Jan 2, 2026

Ticket

#1034

Problem description

Adding Eltwise binary base test + enabling future tile size support for WH/BH

What's changed

For test infra:

  • Added base eltwise binary test for WH/BH with no broadcast or transpose
  • Enabled base test to have multiple tile blocks (dest switches) + enable tile dims

For llk_math_eltwise_binary.h, refactoring changes to enable piping tile dim support (too many duplications meant piping redundant tile dims):

  • Placed all duplicated ZEROACC code into helper function
  • Placed all duplicated ELW* code into a helper function
  • Consolidated duplicated code branches

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update

Checklist

@github-actions
Copy link
Contributor

github-actions bot commented Jan 2, 2026

Thank you for your contribution! 🚀
If you want to run metal post-commit tests, you can add the metal-post-commit-tests label to this pull request.
📖 For more information, please refer to our CONTRIBUTING guide.

@github-actions github-actions bot added test-infra This label is used for issues, pull requests, or tasks related to the LLK testing framework blackhole labels Jan 2, 2026
@rtawfik01 rtawfik01 force-pushed the rtawfik/add-binary-tile-shapes branch from 1d22eb3 to 3e7877f Compare January 2, 2026 22:54
@rtawfik01 rtawfik01 force-pushed the rtawfik/add-binary-tile-shapes branch 4 times, most recently from 2fa9902 to 3a1f8bb Compare January 7, 2026 20:10
@rtawfik01 rtawfik01 force-pushed the rtawfik/add-binary-tile-shapes branch from 3a1f8bb to 935cbfb Compare January 10, 2026 04:45
@rtawfik01 rtawfik01 marked this pull request as ready for review January 10, 2026 04:46
Copilot AI review requested due to automatic review settings January 10, 2026 04:46
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds a base eltwise binary test for Wormhole (WH) and Blackhole (BH) architectures without broadcast or transpose functionality, and refactors the existing LLK math code to reduce duplication while enabling future tile dimension support.

Changes:

  • Refactored LLK math eltwise binary code for WH/BH to consolidate duplicate code patterns using helper functions
  • Added new base eltwise binary test infrastructure supporting variable tile dimensions and multiple tile blocks
  • Extended Python test helpers to support variable tile dimensions (not just 32x32)

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
tt_llk_wormhole_b0/llk_lib/llk_math_eltwise_binary.h Refactored to add helper functions eltwise_binary_func and eltwise_binary_reuse_dest_helper_func to eliminate code duplication in ELWMUL path
tt_llk_blackhole/llk_lib/llk_math_eltwise_binary.h Similar refactoring as Wormhole; moved eltwise_binary_configure_addrmod to top of file
tests/sources/eltwise_binary_test.cpp New C++ test implementing unpack, math, and pack kernels for basic eltwise binary operations with block/tile support
tests/python_tests/test_eltwise_binary.py New Python test file with parametrized test setup for eltwise binary operations
tests/python_tests/helpers/unpack.py Updated unpack_res_tiles to support variable tile dimensions
tests/python_tests/helpers/tilize_untilize.py Added tilize_tile function and updated tilize_block to support variable tile dimensions
tests/python_tests/helpers/test_variant_parameters.py Added NUM_TILES_IN_BLOCK and NUM_BLOCKS runtime parameters
tests/python_tests/helpers/test_config.py Updated hash generation to include stimuli address-related fields
tests/python_tests/helpers/stimuli_generator.py Added sequential_A and sequential_B parameters for generating sequential test data
tests/python_tests/helpers/stimuli_config.py Updated to use tile_elements parameter instead of hardcoded TILE_ELEMENTS
tests/python_tests/helpers/golden_generators.py Updated TilizeGolden to support variable tile dimensions

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.


assert len(res_from_L1) == len(
golden_tensor
), "Result tensor and golder tensor are not of the same length"
Copy link

Copilot AI Jan 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spelling error: "golder" should be "golden".

Suggested change
), "Result tensor and golder tensor are not of the same length"
), "Result tensor and golden tensor are not of the same length"

Copilot uses AI. Check for mistakes.
@rtawfik01 rtawfik01 force-pushed the rtawfik/add-binary-tile-shapes branch from 935cbfb to 619e1fb Compare January 10, 2026 04:52
Copy link
Contributor

@ncvetkovicTT ncvetkovicTT left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks a lot cleaner ^^

const int num_blocks = params->NUM_BLOCKS;

// Configure hardware for unpacking, no broadcast, no transpose
_llk_unpack_hw_configure_<is_fp32_dest_acc_en>(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Elsewhere in the code of the kernel you added false/true even for "unused" or rather "don't care" parameters. Can we do the same here for disable_src_zero_flag? I prefer having:

Suggested change
_llk_unpack_hw_configure_<is_fp32_dest_acc_en>(
_llk_unpack_hw_configure_<is_fp32_dest_acc_en, false /*disable_src_zero_flag*/>(

since it will be easier for us to make the changes at the place of calling _llk_unpack_hw_configure_ when we remove disable_src_zero_flag from it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will do

}
}
constexpr uint32_t outerloop = (high_fidelity) ? 2 : ((binary_reuse_dest != EltwiseBinaryReuseDestType::NONE) ? 2 : 1);
eltwise_binary_reuse_dest_helper_func<is_fp32_dest_acc_en, binary_reuse_dest>(outerloop, 0, clear_fp32_dst_acc, dst_index);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:

Suggested change
eltwise_binary_reuse_dest_helper_func<is_fp32_dest_acc_en, binary_reuse_dest>(outerloop, 0, clear_fp32_dst_acc, dst_index);
eltwise_binary_reuse_dest_helper_func<is_fp32_dest_acc_en, binary_reuse_dest>(outerloop, 0 /*face_base_offset*/, clear_fp32_dst_acc, dst_index);

}
tmp.program();
}
ckernel_template tmp(outerloop, innerloop, eltwise_binary_func<eltwise_binary_type>(0, acc_to_dest, broadcast_type, addr_mod));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
ckernel_template tmp(outerloop, innerloop, eltwise_binary_func<eltwise_binary_type>(0, acc_to_dest, broadcast_type, addr_mod));
ckernel_template tmp(outerloop, innerloop, eltwise_binary_func<eltwise_binary_type>(0 /*clr_src*/, acc_to_dest, broadcast_type, addr_mod));

ckernel_template tmp(outerloop, innerloop, TT_OP_ELWSUB(0, acc_to_dest, broadcast_type, addr_mod, 0));
tmp.set_end_op(TT_OP_SETRWC(p_setrwc::CLR_AB, p_setrwc::CR_AB, 0, 0, 0, p_setrwc::SET_AB));
tmp.program();
tmp.set_last_inner_loop_instr(eltwise_binary_func<ELWMUL>(0, 0, broadcast_type, ADDR_MOD_2)); // Incr fidelity last inst of inner loop
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
tmp.set_last_inner_loop_instr(eltwise_binary_func<ELWMUL>(0, 0, broadcast_type, ADDR_MOD_2)); // Incr fidelity last inst of inner loop
tmp.set_last_inner_loop_instr(eltwise_binary_func<ELWMUL>(0 /*clr_src*/, 0 /*acc_to_dest*/, broadcast_type, ADDR_MOD_2)); // Incr fidelity last inst of inner loop

tmp.program();
}
else if constexpr (eltwise_binary_type == ELWSUB)
ckernel_template tmp(high_fidelity ? NUM_FIDELITY_PHASES : outerloop, innerloop, eltwise_binary_func<ELWMUL>(0, 0, broadcast_type, addr_mod));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
ckernel_template tmp(high_fidelity ? NUM_FIDELITY_PHASES : outerloop, innerloop, eltwise_binary_func<ELWMUL>(0, 0, broadcast_type, addr_mod));
ckernel_template tmp(high_fidelity ? NUM_FIDELITY_PHASES : outerloop, innerloop, eltwise_binary_func<ELWMUL>(0 /*clr_src*/, 0 /*acc_to_dest*/, broadcast_type, addr_mod));

tmp.set_end_op(TT_OP_SETRWC(p_setrwc::CLR_AB, p_setrwc::CR_AB, 0, 0, 0, p_setrwc::SET_AB));
tmp.program();
tmp.set_last_inner_loop_instr(eltwise_binary_func<ELWMUL>(0, 0, broadcast_type, ADDR_MOD_2)); // Incr fidelity last inst of inner loop
tmp.set_last_outer_loop_instr(eltwise_binary_func<ELWMUL>(CLR_SRC, 0, broadcast_type, ADDR_MOD_3));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
tmp.set_last_outer_loop_instr(eltwise_binary_func<ELWMUL>(CLR_SRC, 0, broadcast_type, ADDR_MOD_3));
tmp.set_last_outer_loop_instr(eltwise_binary_func<ELWMUL>(CLR_SRC, 0 /*acc_to_dest*/, broadcast_type, ADDR_MOD_3));

@rtawfik01 rtawfik01 force-pushed the rtawfik/add-binary-tile-shapes branch 6 times, most recently from fa5be5c to d6cf043 Compare January 14, 2026 21:48
# Elements to pack per tile:
# - For tilize tests (write_full_tiles=True): write all 1024 elements
# - For other tests: write only the faces we care about
if write_full_tiles:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@amokanTT for some reason I have to add this hack, or else test_tilize_sweep fails, I don't understand why test_tilize_sweep even for smaller tile dimensions needs full tiles in L1? I think this is an issue with tilize test

@rtawfik01 rtawfik01 force-pushed the rtawfik/add-binary-tile-shapes branch from 2c89ee1 to 1fbf97b Compare January 16, 2026 22:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

blackhole test-infra This label is used for issues, pull requests, or tasks related to the LLK testing framework wormhole

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants