Skip to content

Conversation

@arichinsTT
Copy link
Contributor

@arichinsTT arichinsTT commented Dec 9, 2025

Description

added metal and tt-lang (up to spec, not necessarily compiling) implementations of a single and multicore matmul that take in tilized inputs.
In particular looking to get feedback on tt-lang multicore implementation and how it handles work distribution.

Type of Change

  • Bug fix (non-breaking change that fixes an issue)
  • New feature (non-breaking change that adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update
  • Performance improvement
  • Code refactoring
  • Build/CI changes

Testing

on team box
build metal, with ./build_metal.sh -c -e --build-dir build --development --cxx-compiler-path=clang++-18 --c-compiler-path=clang-18
create env (with metal script)
run pytests in ./examples/metal_examples/[single|multi]core_matmul/metal/[single|multi]core_matmul.py

@arichinsTT arichinsTT requested a review from a team as a code owner December 9, 2025 17:48
@phizalev-TT
Copy link
Contributor

Looks like examples/metal_examples/multicore_reuse_bcast_matmul/metal/metal_mc_r_bc_matmul.py is empty. Should it be removed from PR?

@arichinsTT arichinsTT added this to the M0 milestone Dec 18, 2025
@zoecarver
Copy link
Contributor

Comparing to #57 there are some nice perf improvements in #57 (optimized dst register usage, block based processing, serializing noc transactions to prevent overlap). I think that's okay/good though, we want this to be dead simple right? Maybe we can bring in some of the patterns from #57 after as a more optimized example.

@arichinsTT
Copy link
Contributor Author

Comparing to #57 there are some nice perf improvements in #57 (optimized dst register usage, block based processing, serializing noc transactions to prevent overlap). I think that's okay/good though, we want this to be dead simple right? Maybe we can bring in some of the patterns from #57 after as a more optimized example.

I have PR's waiting to go after this one that do similar optimizations! it will be interesting to see how they are similar or do things differently

Copy link
Contributor

@zoecarver zoecarver left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other than my outstanding review comments, this LGTM. Thank you Alex!

I'm super excited to try running the ttlang test with our actual DSL/compiler and enumerate the things that we will need to implement to get them working.

@arichinsTT arichinsTT merged commit 2b45b73 into main Dec 29, 2025
5 checks passed
@arichinsTT arichinsTT deleted the arichins/singleCoreMatmul branch December 29, 2025 21:25
Copy link
Contributor

@brnorris03 brnorris03 Jan 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file is used by the new pytests (added in #159) so they are not runnable atm. I will undelete in #203 and place in a better location (doesn't belong in examples at this point).

brnorris03 added a commit that referenced this pull request Jan 6, 2026
…ion, and broken pytests (#208)

- Problem: TTNN package instability and configure-time availability
check - ttnn package in `tt-mlir/third_party/tt-metal` may disappear or
change, and `ttlang_check_ttnn_available()` checked at configure time
when ttnn only available after build.
Fix: Created `cmake/modules/CopyTTNNPythonPackage.cmake` to copy ttnn to
`build/python_packages/ttnn/` during build. Removed
`ttlang_check_ttnn_available()` from `cmake/modules/TTLangUtils.cmake`.
Tests use runtime checks (`REQUIRES: ttnn` directive,
`pytest.importorskip()`).

- Problem: CAPI library resolution issues - `python/CMakeLists.txt`
could find wrong `libTTMLIRPythonCAPI.so` from different tt-mlir install
at configure time, and Python extension could resolve wrong library at
runtime, causing MLIRContext registry mismatches.
Fix: Made configure-time search paths mutually exclusive (only search
`TTMLIR_BUILD_DIR` when defined, else search `TTMLIR_PATH`). Added
`BUILD_RPATH` and `INSTALL_RPATH` properties to constrain runtime
library search to configured tt-mlir directories.

- Problem: Missing pytest infrastructure and broken test files - no
pytest check target or configuration, lit tests incorrectly collected by
pytest, `test_elementwise_ops.py` imported from non-existent
`examples/utils` (deleted in #67), `test_block_allocation.py` used wrong
function name `new_split_work_to_cores` and unsafe ttnn import.
Fix: Added `check-ttlang-pytest` target to `test/CMakeLists.txt`.
Created `test/python/conftest.py` with feature detection, markers,
fixtures, and `collect_ignore` list. Recovered the deleted
`examples/utils.py` that was required in pytests -- now
`test/python/utils.py`. Fixed corresponding imports and function names
in test files, added `# REQUIRES: ttnn` directive, changed to
`pytest.importorskip("ttnn")`.
  
- Problem: `test/python/simple_add_multitile.py` fails on qb because of
ordering differences.
  Fix: Use `CHECK-DAG:` appropriately.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants