[test] Add me2e pytest framework for middle end #167

brnorris03 · 2025-12-25T18:28:25Z

What?

New pytest-based ME2E (middle-end to end-to-end) test framework in test/me2e/
for testing TTL MLIR generation, compilation, and hardware execution.

Why?

- Unified framework for testing TTL MLIR generation, compilation, and hardware execution
- Declarative parametrized tests enable testing all ops with all configs via a single test function
- Auto-generates test classes from TTLElementwiseOps.def, keeping tests in sync with the dialect
- ULP-based comparison utilities for hardware-accurate golden validation
- Supports custom fused op tests via MLIR templates
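
For readers unfamiliar with ULP-based comparison, here is a minimal sketch of the idea for float32 values. It is not the PR's utility code; the function names `ulp_distance` and `within_ulp` are hypothetical, and bf16 would need a 16-bit variant of the same bit trick. NaN inputs are not handled.

```python
import struct


def _ordered_bits(x: float) -> int:
    """Map a float32 value to an integer whose order matches the float order."""
    (bits,) = struct.unpack("<i", struct.pack("<f", x))
    # Negative floats are sign-magnitude; mirror them so the mapping is monotonic
    # and +0.0 and -0.0 both map to 0.
    return bits if bits >= 0 else -(bits & 0x7FFFFFFF)


def ulp_distance(a: float, b: float) -> int:
    """Number of representable float32 values between a and b."""
    return abs(_ordered_bits(a) - _ordered_bits(b))


def within_ulp(actual, expected, max_ulp: int) -> bool:
    """True if every element pair differs by at most max_ulp float32 steps."""
    return all(ulp_distance(a, e) <= max_ulp for a, e in zip(actual, expected))


if __name__ == "__main__":
    next_after_one = 1.0 + 2.0 ** -23  # the float32 value directly above 1.0
    print(ulp_distance(1.0, next_after_one))
    print(within_ulp([1.0, 2.0], [next_after_one, 2.0], max_ulp=1))
```

Measuring error in ULPs rather than with an absolute tolerance keeps the threshold meaningful across magnitudes, which is one plausible reason to prefer it for golden validation against hardware results.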

How?

Two test approaches (both auto-generated from TTLElementwiseOps.def):

- Declarative parametrized tests (primary): A single function in test_compute_ops.py is parametrized over all operations and configurations. COMPUTE_OPS in op_specs.py is auto-generated from the .def file. All pipeline stages run in one function via runner.py. Fast and clean, with no persistent artifacts.
- Class-based tests: Test classes are auto-generated from the .def file. Classes declare what to test (op name, arity, input range, tolerance) via class attributes; base classes handle how (MLIR generation, compilation, execution, validation). Saves intermediate artifacts (MLIR, C++, results) for debugging and supports custom (fused) ops via MLIR templates.
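
To illustrate how a .def file can drive a parametrized test matrix, here is a rough sketch. The macro names `TTL_UNARY_OP` and `TTL_BINARY_OP` and the entry layout are assumptions made for illustration; the real TTLElementwiseOps.def format may differ, and `parse_ops` and `make_cases` are hypothetical helpers, not functions from this PR.

```python
import re
from itertools import product

# Hypothetical X-macro entries; the real .def layout may differ.
DEF_TEXT = """
TTL_BINARY_OP(Add, add)
TTL_BINARY_OP(Sub, sub)
TTL_UNARY_OP(Exp, exp)
"""

_ENTRY = re.compile(r"^TTL_(UNARY|BINARY)_OP\((\w+),\s*(\w+)\)$")


def parse_ops(text):
    """Return (op_name, arity) pairs for every recognized macro entry."""
    ops = []
    for line in text.splitlines():
        match = _ENTRY.match(line.strip())
        if match:
            kind, _class_name, op_name = match.groups()
            ops.append((op_name, 2 if kind == "BINARY" else 1))
    return ops


GRIDS = [(1, 1), (2, 2)]   # grid shapes named in the PR description
DTYPES = ["bf16", "f32"]   # dtypes named in the PR description


def make_cases(ops):
    """Cross every op with every grid and dtype, with a readable test id."""
    return [
        (f"{name}-{gh}x{gw}-{dtype}", name, arity, (gh, gw), dtype)
        for (name, arity), (gh, gw), dtype in product(ops, GRIDS, DTYPES)
    ]


if __name__ == "__main__":
    for case in make_cases(parse_ops(DEF_TEXT)):
        print(case[0])
```

A list like this could be handed to `pytest.mark.parametrize`, using the first element of each tuple as the test id, so that adding an op to the .def file adds tests without touching the test function.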

Architecture:

- E2ETestBase: Ordered 5-stage pipeline (build → compile → translate → execute → validate)
- OpTestBase: Adds OP_STR, ARITY, INPUT_RANGE; subclasses are auto-generated from the .def file
- FusedOpTestBase: Custom MLIR via get_mlir_template() override (examples in test/me2e/ops/test_fused.py)
- Host configuration and execution reuse the shared kernel_runner module (python/ttl/kernel_runner.py)
- Artifacts are saved to build/test/me2e/<TestName>/ for debugging (class-based tests) or to temporary directories (declarative tests).
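
The split between "what to test" (class attributes) and "how to test" (base class) can be sketched as follows. This is a simplified stand-in, not the PR's E2ETestBase or OpTestBase: `PipelineTestBase`, `make_op_test`, and the placeholder stage bodies are hypothetical, and the real stages call the compiler and the device.

```python
class PipelineTestBase:
    """Hypothetical base class: runs the five stages in a fixed order."""

    STAGES = ("build", "compile", "translate", "execute", "validate")
    OP_STR = None              # set by subclasses, e.g. "add"
    ARITY = None               # number of tensor inputs
    INPUT_RANGE = (-1.0, 1.0)  # default input domain

    def run(self):
        """Run stages in order; stop at the first failure and record it."""
        log = []
        for stage in self.STAGES:
            try:
                getattr(self, stage)()
            except Exception as exc:
                log.append((stage, f"failed: {exc}"))
                break  # later stages depend on earlier ones
            log.append((stage, "ok"))
        return log

    # Placeholder stage bodies standing in for real compiler and device calls.
    def build(self):
        self.module = f"// MLIR for {self.OP_STR} with {self.ARITY} input(s)"

    def compile(self):
        self.compiled = self.module + " (lowered)"

    def translate(self):
        self.cpp = f"// C++ kernel for {self.OP_STR}"

    def execute(self):
        self.result = list(self.INPUT_RANGE)

    def validate(self):
        assert len(self.result) == 2


def make_op_test(name, arity, input_range=(-1.0, 1.0)):
    """Create a subclass that only declares what to test."""
    attrs = {"OP_STR": name, "ARITY": arity, "INPUT_RANGE": input_range}
    return type(f"Test{name.capitalize()}", (PipelineTestBase,), attrs)


TestAdd = make_op_test("add", 2)
TestSqrt = make_op_test("sqrt", 1, input_range=(0.0, 4.0))


class BrokenCompile(TestAdd):
    """Shows that a failing stage prevents later stages from running."""

    def compile(self):
        raise RuntimeError("pass pipeline failed")


if __name__ == "__main__":
    print(TestAdd().run())
    print(BrokenCompile().run())
```

Recording which stage failed also gives a natural hook for the kind of failure classification (compile, runtime, golden) mentioned in the commit messages.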

flowchart TD
    subgraph Inputs
        DEF[.def file]
        MLIR[Custom MLIR]
    end

    DEF --> GEN[Generate: TestAdd, TestExp...]
    DEF --> DECL[Declarative: test_compute_ops]
    MLIR --> FUSED[FusedOpTestBase]

    GEN --> P1
    FUSED --> P1
    DECL --> P2

    subgraph P1[Class-based Pipeline]
        S1[build] --> S2[compile]
        S2 --> S3[translate]
        S3 --> S4[execute]
        S4 --> S5[validate]
    end

    subgraph P2[Declarative Pipeline]
        D1[build] --> D2[compile]
        D2 --> D3[translate]
        D3 --> D4[execute + validate]
    end

    S1 --> O1[module.mlir]
    S2 --> O2[compiled.mlir]
    S3 --> O3[*.cpp]
    S4 --> O4[result.pt]

    D4 --> O5[in-memory validation]

Current status:

- All 5 pipeline stages (build, compile, translate, execute, validate): Working for all ops
- Tests the 13 elementwise operations defined in TableGen (add, sub, mul, max, exp, log, sqrt, rsqrt, tanh, abs, neg, relu, sigmoid)
- Two grid configurations tested via parametrization: 1x1 (single tile) and 2x2 (4 tiles)
- All single-op tests are parametrized by dtype (bf16 and f32), but some f32 results exceed the error threshold and are marked xfail; see [BUG] Wrong results for add/sub/log elementwise ops with f32 #254
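
One way to keep known failures visible without breaking the suite is a small expected-outcome table. The sketch below is an assumption-laden illustration: the set of failing ops is taken only from the title of issue #254, the actual xfail list in the PR may be larger (for example, platform-specific entries), and `expected_outcome` and `annotate` are hypothetical names.

```python
# Assumption: failing ops are taken from the title of issue #254 only.
KNOWN_F32_FAILURES = {"add", "sub", "log"}
XFAIL_REASON = "f32 result exceeds error threshold (issue #254)"


def expected_outcome(op, dtype):
    """Return ("xfail", reason) for known-bad combinations, else ("pass", None)."""
    if dtype == "f32" and op in KNOWN_F32_FAILURES:
        return ("xfail", XFAIL_REASON)
    return ("pass", None)


def annotate(ops, dtypes=("bf16", "f32")):
    """Build the expected outcome for every (op, dtype) combination."""
    return {(op, dt): expected_outcome(op, dt) for op in ops for dt in dtypes}


if __name__ == "__main__":
    for key, (outcome, reason) in annotate(["add", "mul", "log"]).items():
        print(key, outcome, reason or "")
```

In pytest, a table like this is typically applied by wrapping the affected cases in `pytest.param(..., marks=pytest.mark.xfail(reason=...))`, so the tests still run and start reporting unexpected passes once the underlying bug is fixed.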

See
https://github.com/tenstorrent/tt-lang/blob/bnorris/me2e-tests/test/me2e/README.md
for more details.

How to Test?

Part of the check-ttlang-all target. Also separately testable with:

source build/env/activate
pytest test/me2e/ -v              # All me2e tests
ninja -C build check-ttlang-me2e  # CMake target

CI:
https://github.com/tenstorrent/tt-lang/actions/runs/21126783459/job/60749650887

Checklist:

Self-reviewed (style, logic)
Added tests (227 new tests)
PR is focused (ME2E test framework only)

input) Introduces declarative test infrastructure for TTL-dialect MLIR:

- op_specs.py: ComputeOpSpec/FusedOpSpec for single and fused kernels
- config_specs.py: TestConfig for tile sizes, dtypes, buffer factors
- ttl_builder.py: Programmatic TTL MLIR generation (reader/compute/writer)
- compile_utils.py: Pass pipeline and C++ translation
- runner.py: ttnn.generic_op execution and validation against PyTorch golden

Supports 12 single ops, 8 fused op chains, 5 configurations (120 tests). Generated C++ sources written to build/test/middle_end/<op>/<config>/.

…structure:

1. Marks class for skip/xfail annotations on ops
2. Platform-aware skip markers (skip_config, only_config)
3. Automatic test metadata extraction for XML reporting
4. Structured grid/shape configuration sets (SMALL_GRIDS, etc.)
5. Input range constraints for domain-sensitive ops (sqrt, rsqrt)
6. Device caching across test session
7. Failure stage classification (compile/runtime/golden exceptions)
8. ID generation helpers (shape_str, torch_dtype_to_abbrev)
9. MLIR dump option (--dump-mlir) for debugging
10. PCC-based golden comparison (--check-pcc, compare_tensors)

New files:

- test_utils.py: Shared utilities, exceptions, PCC computation

Updated files:

- conftest.py: Platform detection, metadata hooks, CLI options
- config_specs.py: Categorized grid shapes, make_config helper
- op_specs.py: input_range field for ops with domain constraints
- runner.py: Uses compare_tensors, custom exceptions, MLIR dump
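
As background for the PCC-based comparison mentioned above, the Pearson correlation coefficient between a device result and a golden reference can be computed as in this sketch. It uses plain Python lists for self-containment; the function name `pcc` and the constant-input convention are assumptions, and the PR's compare_tensors helper is not reproduced here.

```python
import math


def pcc(actual, expected):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(actual)
    if n == 0 or n != len(expected):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_a = sum(actual) / n
    mean_e = sum(expected) / n
    cov = sum((a - mean_a) * (e - mean_e) for a, e in zip(actual, expected))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_e = sum((e - mean_e) ** 2 for e in expected)
    if var_a == 0 or var_e == 0:
        # Correlation is undefined for constant inputs; this sketch treats
        # exactly equal inputs as a perfect match and anything else as none.
        return 1.0 if list(actual) == list(expected) else 0.0
    return cov / math.sqrt(var_a * var_e)


if __name__ == "__main__":
    golden = [0.0, 1.0, 2.0, 3.0]
    device = [0.0, 1.01, 1.98, 3.02]
    print(pcc(device, golden))
```

PCC is insensitive to a uniform scale or offset error, which is why it is usually paired with an element-wise check such as an absolute, relative, or ULP tolerance.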

…ize and minimize redundancies

- Rename test/e2e/builder/executor.py to ttnn_runner.py
- Rename execute_* functions to run_* for consistency
- Fix compute kernel compile-time args (CB indices were missing)
- Fix runtime_args format (4D -> 3D list structure)
- Skip execution stages blocked by TensorAccessorArgs codegen issue
- Skip fused op translation (compute-only, no reader/writer threads)
- Update README with blocking issue documentation

cd /home/bnorris/tt/tt-lang && source build/env/activate && TTLANG_DEBUG_KERNELS=1 timeout 90 pytest "test/me2e/test_compute_ops.py::test_compute[TestConfig(tile_h=32, tile_w=32, block_h=1, block_w=1, dtype=torch.bfloat16, num_tiles=1, buffer_factor=1, memory_layout=<MemoryLayout.INTERLEAVED: 'interleaved'>)-add]" -vsx

Make layout attribute generation configurable via E2EConfig.buffer_type (DRAM/L1) and E2EConfig.memory_layout (interleaved/sharded variants). Defaults preserved (DRAM + interleaved). Enables future tests with different memory configurations without modifying builder code.
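
A configuration object with these two knobs might look like the sketch below. Only the `INTERLEAVED` member and its `'interleaved'` value appear in this page; the other enum members, the class name `E2EConfigSketch`, and the `#layout_sketch<...>` attribute text are placeholders invented for illustration and do not reflect real TTL or TTNN attribute syntax.

```python
from dataclasses import dataclass
from enum import Enum


class BufferType(Enum):
    DRAM = "dram"
    L1 = "l1"


class MemoryLayout(Enum):
    INTERLEAVED = "interleaved"
    HEIGHT_SHARDED = "height_sharded"  # hypothetical sharded variant


@dataclass(frozen=True)
class E2EConfigSketch:
    """Hypothetical config; defaults match the stated DRAM + interleaved."""

    buffer_type: BufferType = BufferType.DRAM
    memory_layout: MemoryLayout = MemoryLayout.INTERLEAVED


def layout_attr(cfg: E2EConfigSketch) -> str:
    """Render a placeholder layout attribute string from the config."""
    return f"#layout_sketch<{cfg.buffer_type.value}, {cfg.memory_layout.value}>"


if __name__ == "__main__":
    print(layout_attr(E2EConfigSketch()))
    print(layout_attr(E2EConfigSketch(BufferType.L1, MemoryLayout.HEIGHT_SHARDED)))
```

Deriving the attribute text from the config in one place means new memory configurations only require a new config value, which matches the stated goal of not modifying builder code.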

… utilities (#249)

### What?

Extract shared kernel execution logic into `python/ttl/kernel_runner.py` and move test utilities from `test/python/test_helpers.py` to `test/ttlang_test_utils.py` at the test root.

### Why?

Kernel execution logic was duplicated between `CompiledTTNNKernel.__call__` and test code (and would be further replicated in the compiler me2e tests), making it difficult to maintain a single source of truth for building kernel descriptors and CB descriptors. Test utilities were also scattered in `test/python/test_helpers.py`, making them less accessible to lit tests and other test infrastructure.

### How?

- Created `python/ttl/kernel_runner.py` with reusable functions (`build_kernel_descriptors`, `build_cb_descriptors`, `run_kernel_on_device`) for building kernel descriptors, CB descriptors, and executing kernels via `ttnn.generic_op`.
- Moved `test/python/test_helpers.py` to `test/ttlang_test_utils.py` at the test root, making utilities accessible to both pytest and lit tests.
- Refactored `CompiledTTNNKernel.__call__` to delegate to `run_kernel_on_device()` from `kernel_runner.py`, reducing code duplication.
- Updated all test imports to use the new `ttlang_test_utils` location.

### How to Test?

Run existing tests to verify no regressions (check-ttlang-all passes). Also tested in the me2e PR that depends on this one: #167

### Checklist:

* [x] Self-reviewed (style, logic)
* [x] Added tests (or justified none needed)
* [x] PR is small and focused (one task)

brnorris03 added 5 commits December 24, 2025 23:47

refactor to use class-based design to greatly reduce implementation s…

86e68bc

…ize and minimize redundancies

ulp-based errors

11b8114

add test for test utils

0418d1d

brnorris03 force-pushed the bnorris/me2e-tests branch from 62f1a40 to 0418d1d Compare December 25, 2025 18:43

brnorris03 added 7 commits December 25, 2025 11:51

fix ordering

d545163

register passes

9dca069

generate op tests from def file

08df5b5

add ability to test any custom compute ops

3d21f9f

reorganize a bit

7bcca3f

exclude e2e pytests from lit

987dc2d

brnorris03 changed the title ~~[test] pytest framework for different types of e2e (and me2e) tests~~ [test] Add e2e pytest framework Dec 26, 2025

brnorris03 changed the title ~~[test] Add e2e pytest framework~~ [test] Add e2e pytest framework for middle end (and DSL) Dec 26, 2025

brnorris03 mentioned this pull request Dec 26, 2025

[ttl] The one python bindings PR #159

Merged

3 tasks

brnorris03 added 13 commits December 26, 2025 20:14

Merge remote-tracking branch 'origin/main' into bnorris/me2e-tests

23ad7db

wip

c07780e

Merge remote-tracking branch 'origin/main' into bnorris/me2e-tests

630b7e4

rename to me2e, cleanup

4e0dcd8

remove references to system descriptors, other cleanup

8174d60

wip

67632a8

execute works, wrong results

862d019

binary ops pass

72cba29

custom mlir tests pass

3808fa2

all tests pass

dce66f7

update cmake and CI to run me2e test; a few leftover renamings

68206b0

update test/me2e/README.md

0dcf486

brnorris03 changed the title ~~[test] Add e2e pytest framework for middle end (and DSL)~~ [test] Add me2e pytest framework for middle end Jan 19, 2026

brnorris03 added 12 commits January 18, 2026 22:06

group tests by class

4252514

remove dead code

2bedd75

install prereq pytest-order in ci

7e2e722

more readable test names

0dbc43a

use double buffering by default

9802980

revert unrelated change

4da25b5

misc cleanup

7038828

refactor test utils to be able to reuse in me2e tests

c7ca92d

reuse common mlir utils in me2e tests

3d9e45f

fix python path in lit tests

abbd8f0

add test pythonpath in activate.in

b614c10

brnorris03 mentioned this pull request Jan 19, 2026

[nfc] Refactor to extract kernel execution logic and consolidate test utilities #249

Merged

3 tasks

brnorris03 added 2 commits January 20, 2026 09:50

Merge remote-tracking branch 'origin/main' into bnorris/me2e-tests

ff15400

post-merge fixes

10e6f9c

brnorris03 force-pushed the bnorris/me2e-tests branch from 8e02134 to 10e6f9c Compare January 20, 2026 20:37

brnorris03 added 4 commits January 20, 2026 13:24

disable pyc caching in pytests

430f4fc

add xfails for some f32 tests, issue #254

147d9d3

add more xfails for n150

b615f36

Merge remote-tracking branch 'origin/main' into bnorris/me2e-tests

796cfef

brnorris03 marked this pull request as ready for review January 20, 2026 23:02

brnorris03 requested a review from a team as a code owner January 20, 2026 23:02

brnorris03 added 2 commits January 20, 2026 15:13

use consistent names

21cfeb0

add ci xfails and use fixed random seed in ci

4d75323
