-
Notifications
You must be signed in to change notification settings - Fork 1
metal/tt-lang single and multi-core matmul #67
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
examples/metal_examples/singlecore_matmul/metal/kernels/mm_reader.cpp
Outdated
Show resolved
Hide resolved
examples/metal_examples/multicore_matmul/ttlang/multicore_matmul.py
Outdated
Show resolved
Hide resolved
examples/metal_examples/multicore_matmul/ttlang/multicore_matmul.py
Outdated
Show resolved
Hide resolved
|
Looks like |
examples/metal_examples/multicore_matmul/ttlang/multicore_matmul.py
Outdated
Show resolved
Hide resolved
examples/metal_examples/singlecore_matmul/ttlang/singlecore_matmul.py
Outdated
Show resolved
Hide resolved
examples/metal_examples/multicore_matmul/metal/kernels/mm_compute.cpp
Outdated
Show resolved
Hide resolved
examples/metal_examples/multicore_matmul/metal/multicore_matmul.py
Outdated
Show resolved
Hide resolved
examples/metal_examples/singlecore_matmul/metal/kernels/mm_reader.cpp
Outdated
Show resolved
Hide resolved
|
Comparing to #57 there are some nice perf improvements in #57 (optimized dst register usage, block based processing, serializing noc transactions to prevent overlap). I think that's okay/good though, we want this to be dead simple right? Maybe we can bring in some of the patterns from #57 after as a more optimized example. |
I have PR's waiting to go after this one that do similar optimizations! it will be interesting to see how they are similar or do things differently |
zoecarver
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Other than my outstanding review comments, this LGTM. Thank you Alex!
I'm super excited to try running the ttlang test with our actual DSL/compiler and enumerate the things that we will need to implement to get them working.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
…ion, and broken pytests (#208) - Problem: TTNN package instability and configure-time availability check - ttnn package in `tt-mlir/third_party/tt-metal` may disappear or change, and `ttlang_check_ttnn_available()` checked at configure time when ttnn only available after build. Fix: Created `cmake/modules/CopyTTNNPythonPackage.cmake` to copy ttnn to `build/python_packages/ttnn/` during build. Removed `ttlang_check_ttnn_available()` from `cmake/modules/TTLangUtils.cmake`. Tests use runtime checks (`REQUIRES: ttnn` directive, `pytest.importorskip()`). - Problem: CAPI library resolution issues - `python/CMakeLists.txt` could find wrong `libTTMLIRPythonCAPI.so` from different tt-mlir install at configure time, and Python extension could resolve wrong library at runtime, causing MLIRContext registry mismatches. Fix: Made configure-time search paths mutually exclusive (only search `TTMLIR_BUILD_DIR` when defined, else search `TTMLIR_PATH`). Added `BUILD_RPATH` and `INSTALL_RPATH` properties to constrain runtime library search to configured tt-mlir directories. - Problem: Missing pytest infrastructure and broken test files - no pytest check target or configuration, lit tests incorrectly collected by pytest, `test_elementwise_ops.py` imported from non-existent `examples/utils` (deleted in #67), `test_block_allocation.py` used wrong function name `new_split_work_to_cores` and unsafe ttnn import. Fix: Added `check-ttlang-pytest` target to `test/CMakeLists.txt`. Created `test/python/conftest.py` with feature detection, markers, fixtures, and `collect_ignore` list. Recovered the deleted `examples/utils.py` that was required in pytests -- now `test/python/utils.py`. Fixed corresponding imports and function names in test files, added `# REQUIRES: ttnn` directive, changed to `pytest.importorskip("ttnn")`. - Problem: `test/python/simple_add_multitile.py` fails on qb because of ordering differences. Fix: Use `CHECK-DAG:` appropriately.
Description
added metal and tt-lang (up to spec, not necessarily compiling) implementations of a single and multicore matmul that take in tilized inputs.
In particular looking to get feedback on tt-lang multicore implementation and how it handles work distribution.
Type of Change
Testing
on team box
build metal, with ./build_metal.sh -c -e --build-dir build --development --cxx-compiler-path=clang++-18 --c-compiler-path=clang-18
create env (with metal script)
run pytests in ./examples/metal_examples/[single|multi]core_matmul/metal/[single|multi]core_matmul.py