@brnorris03 brnorris03 commented Dec 25, 2025

What?

New pytest-based ME2E (middle-end to end-to-end) test framework in test/me2e/
for testing TTL MLIR generation, compilation, and hardware execution.

Why?

  • Unified framework for testing TTL MLIR generation, compilation, and hardware
    execution
  • Declarative parametrized tests enable testing all ops with all configs via
    single test function
  • Auto-generates test classes from TTLElementwiseOps.def keeping tests in sync
    with dialect
  • ULP-based comparison utilities for hardware-accurate golden validation
  • Supports custom fused op tests via MLIR templates
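A ULP-based comparison can be sketched as follows. This is a minimal illustration for float32 values using only the standard library; it is not the framework's actual utility. The names `ulp_distance` and `within_ulp` and the default `max_ulp` are hypothetical.

```python
import math
import struct


def _ordered_bits(x: float) -> int:
    """Map a float32 value to an integer whose ordering matches float ordering."""
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    magnitude = bits & 0x7FFFFFFF
    # Negative floats map to negative integers, so +0.0 and -0.0 both map to 0.
    return -magnitude if bits & 0x80000000 else magnitude


def ulp_distance(a: float, b: float) -> int:
    """Number of representable float32 values between a and b."""
    if math.isnan(a) or math.isnan(b):
        raise ValueError("ULP distance is undefined for NaN")
    return abs(_ordered_bits(a) - _ordered_bits(b))


def within_ulp(actual, expected, max_ulp=2):
    """True if every element pair differs by at most max_ulp (hypothetical default)."""
    return all(ulp_distance(x, y) <= max_ulp for x, y in zip(actual, expected))
```

Measuring error in ULPs scales the tolerance with the magnitude of the expected value, which a fixed absolute tolerance does not do.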

How?

Two test approaches (both auto-generated from TTLElementwiseOps.def):

  1. Declarative parametrized tests (primary): A single function in
    test_compute_ops.py is parametrized over all operations and configurations.
    COMPUTE_OPS in op_specs.py is auto-generated from the .def file. All
    pipeline stages run in one function via runner.py. Fast, clean, and leaves
    no saved artifacts.

  2. Class-based tests: Test classes auto-generated from .def file. Classes
    declare what to test (op name, arity, input range, tolerance) via class
    attributes; base classes handle how (MLIR generation, compilation,
    execution, validation). Saves intermediate artifacts (MLIR, C++, results)
    for debugging and supports custom (fused) ops via MLIR templates.
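The auto-generation idea behind both approaches can be sketched as below. The macro names in `DEF_TEXT`, the fields of `ComputeOpSpec`, and the helpers `parse_def` and `make_test_classes` are assumptions for illustration only; the real TTLElementwiseOps.def format and generator may look quite different.

```python
import re
from dataclasses import dataclass

# Hypothetical .def contents; the real macro names and arguments may differ.
DEF_TEXT = """
TTL_BINARY_OP(Add, add)
TTL_BINARY_OP(Max, max)
TTL_UNARY_OP(Exp, exp)
TTL_UNARY_OP(Sqrt, sqrt)
"""


@dataclass(frozen=True)
class ComputeOpSpec:
    name: str   # assumed field: op mnemonic
    arity: int  # assumed field: number of inputs


_ENTRY = re.compile(r"TTL_(UNARY|BINARY)_OP\((\w+),\s*(\w+)\)")


def parse_def(text):
    """Turn each macro line into a spec (declarative approach)."""
    return [
        ComputeOpSpec(name=op, arity=1 if kind == "UNARY" else 2)
        for kind, _cls, op in _ENTRY.findall(text)
    ]


class OpTestBase:
    """Stand-in base class; subclasses only declare what to test."""
    OP_STR = None
    ARITY = None
    INPUT_RANGE = (-1.0, 1.0)  # assumed default


def make_test_classes(specs):
    """Create one Test<Op> class per spec (class-based approach)."""
    classes = {}
    for spec in specs:
        cls_name = "Test" + spec.name.capitalize()
        attrs = {"OP_STR": spec.name, "ARITY": spec.arity}
        classes[cls_name] = type(cls_name, (OpTestBase,), attrs)
    return classes


COMPUTE_OPS = parse_def(DEF_TEXT)
TEST_CLASSES = make_test_classes(COMPUTE_OPS)
```

Because both the spec list and the classes derive from the same text, adding an op to the .def file would add tests in both styles without further edits.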

Architecture:

  • E2ETestBase: Ordered 5-stage pipeline (build → compile → translate → execute
    → validate)
  • OpTestBase: Adds the OP_STR, ARITY, and INPUT_RANGE class attributes;
    subclasses are auto-generated from the .def file
  • FusedOpTestBase: Custom MLIR via get_mlir_template() override
    (examples in test/me2e/ops/test_fused.py)
  • Host configuration and execution reuse the shared kernel_runner
    module (python/ttl/kernel_runner.py)
  • Artifacts saved to build/test/me2e/<TestName>/ for debugging (class-based
    tests) or temporary directories (declarative tests).
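A rough sketch of an ordered five-stage pipeline with per-stage failure reporting follows. `PipelineSketch`, `StageFailure`, and the stub return values are hypothetical stand-ins, not the actual E2ETestBase implementation.

```python
class StageFailure(RuntimeError):
    """Records which pipeline stage raised, for failure classification."""

    def __init__(self, stage, cause):
        super().__init__(f"{stage} stage failed: {cause}")
        self.stage = stage


class PipelineSketch:
    # Stage order mirrors the description: build -> compile -> translate -> execute -> validate.
    STAGES = ("build", "compile", "translate", "execute", "validate")

    def run(self):
        artifact = None
        completed = []
        for stage in self.STAGES:
            try:
                # Each stage consumes the previous stage's output.
                artifact = getattr(self, stage)(artifact)
            except Exception as exc:
                raise StageFailure(stage, exc) from exc
            completed.append(stage)
        return completed

    # Stub stages; a real base class would generate MLIR, compile it, and so on.
    def build(self, _):
        return "module.mlir"

    def compile(self, mlir):
        return "compiled.mlir"

    def translate(self, compiled):
        return ["kernel.cpp"]

    def execute(self, sources):
        return [1.0, 2.0]

    def validate(self, result):
        return result


class BrokenTranslate(PipelineSketch):
    def translate(self, compiled):
        raise ValueError("translation not supported")
```

Subclasses override only the stage they need to change, and a failure carries the stage name so a report can distinguish compile-time from runtime problems.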
flowchart TD
    subgraph Inputs
        DEF[.def file]
        MLIR[Custom MLIR]
    end

    DEF --> GEN[Generate: TestAdd, TestExp...]
    DEF --> DECL[Declarative: test_compute_ops]
    MLIR --> FUSED[FusedOpTestBase]

    GEN --> P1
    FUSED --> P1
    DECL --> P2

    subgraph P1[Class-based Pipeline]
        S1[build] --> S2[compile]
        S2 --> S3[translate]
        S3 --> S4[execute]
        S4 --> S5[validate]
    end

    subgraph P2[Declarative Pipeline]
        D1[build] --> D2[compile]
        D2 --> D3[translate]
        D3 --> D4[execute + validate]
    end

    S1 --> O1[module.mlir]
    S2 --> O2[compiled.mlir]
    S3 --> O3[*.cpp]
    S4 --> O4[result.pt]

    D4 --> O5[in-memory validation]

Current status:

  • All 5 pipeline stages (build, compile, translate, execute, validate): Working
    for all ops
  • Tests the 13 elementwise operations defined in TableGen (add, sub, mul, max, exp, log, sqrt, rsqrt,
    tanh, abs, neg, relu, sigmoid)
  • Two grid configurations tested: 1x1 (single tile) and 2x2 (4 tiles) via parametrization
  • All single-op tests are parametrized by dtype (bf16 and f32). Some f32 tests produce errors that exceed the threshold and are marked xfail; see "[BUG] Wrong results for add/sub/log elementwise ops with f32" (#254)
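The single-op test matrix described above can be enumerated roughly as follows. This sketch covers only the ops × grids × dtypes product. It assumes the xfail set is exactly the f32 cases of the three ops named in the title of issue #254, which may not match the actual markers. `build_cases` and the ID format are hypothetical.

```python
from itertools import product

OPS = ["add", "sub", "mul", "max", "exp", "log", "sqrt", "rsqrt",
       "tanh", "abs", "neg", "relu", "sigmoid"]
GRIDS = [(1, 1), (2, 2)]
DTYPES = ["bf16", "f32"]

# Assumption: expected failures are the f32 runs of the ops named in issue #254.
KNOWN_F32_FAILURES = {"add", "sub", "log"}


def build_cases():
    """Return (case_id, expect_failure) pairs for every op/grid/dtype combination."""
    cases = []
    for op, grid, dtype in product(OPS, GRIDS, DTYPES):
        case_id = f"{op}-{grid[0]}x{grid[1]}-{dtype}"
        expect_failure = dtype == "f32" and op in KNOWN_F32_FAILURES
        cases.append((case_id, expect_failure))
    return cases


CASES = build_cases()
```

In a pytest suite, each pair would typically become one parametrized case, with the expected failures wrapped in an xfail marker.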

See
https://github.com/tenstorrent/tt-lang/blob/bnorris/me2e-tests/test/me2e/README.md
for more details.

How to Test?

Part of the check-ttlang-all target. Also separately testable with:

source build/env/activate
pytest test/me2e/ -v              # All me2e tests
ninja -C build check-ttlang-me2e  # CMake target

CI:
https://github.com/tenstorrent/tt-lang/actions/runs/21126783459/job/60749650887

Checklist:

  • Self-reviewed (style, logic)
  • Added tests (227 new tests)
  • PR is focused (ME2E test framework only)

input)

Introduces declarative test infrastructure for TTL-dialect MLIR:
- op_specs.py: ComputeOpSpec/FusedOpSpec for single and fused kernels
- config_specs.py: TestConfig for tile sizes, dtypes, buffer factors
- ttl_builder.py: Programmatic TTL MLIR generation (reader/compute/writer)
- compile_utils.py: Pass pipeline and C++ translation
- runner.py: ttnn.generic_op execution and validation against PyTorch golden

Supports 12 single ops, 8 fused op chains, 5 configurations (120 tests).
Generated C++ sources written to build/test/middle_end/<op>/<config>/.
…structure:

1. Marks class for skip/xfail annotations on ops
2. Platform-aware skip markers (skip_config, only_config)
3. Automatic test metadata extraction for XML reporting
4. Structured grid/shape configuration sets (SMALL_GRIDS, etc.)
5. Input range constraints for domain-sensitive ops (sqrt, rsqrt)
6. Device caching across test session
7. Failure stage classification (compile/runtime/golden exceptions)
8. ID generation helpers (shape_str, torch_dtype_to_abbrev)
9. MLIR dump option (--dump-mlir) for debugging
10. PCC-based golden comparison (--check-pcc, compare_tensors)
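For item 10, a PCC (Pearson correlation coefficient) check can be sketched in plain Python as below. This is an illustration of the metric only. The helper names `pcc` and `meets_pcc`, the threshold value, and the handling of constant inputs are assumptions and do not describe the actual compare_tensors behavior.

```python
import math


def pcc(actual, expected):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(actual)
    if n == 0 or n != len(expected):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_a = sum(actual) / n
    mean_e = sum(expected) / n
    dev_a = [x - mean_a for x in actual]
    dev_e = [y - mean_e for y in expected]
    var_a = sum(d * d for d in dev_a)
    var_e = sum(d * d for d in dev_e)
    if var_a == 0.0 or var_e == 0.0:
        # Correlation is undefined for constant data; treat an exact match as perfect.
        return 1.0 if list(actual) == list(expected) else 0.0
    cov = sum(x * y for x, y in zip(dev_a, dev_e))
    return cov / math.sqrt(var_a * var_e)


def meets_pcc(actual, expected, threshold=0.99):
    """True if the correlation reaches the (hypothetical) threshold."""
    return pcc(actual, expected) >= threshold
```

PCC is insensitive to a uniform scale or offset error, so it is usually paired with an elementwise check such as an absolute, relative, or ULP tolerance.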

New files:
- test_utils.py: Shared utilities, exceptions, PCC computation

Updated files:
- conftest.py: Platform detection, metadata hooks, CLI options
- config_specs.py: Categorized grid shapes, make_config helper
- op_specs.py: input_range field for ops with domain constraints
- runner.py: Uses compare_tensors, custom exceptions, MLIR dump
- Rename test/e2e/builder/executor.py to ttnn_runner.py
- Rename execute_* functions to run_* for consistency
- Fix compute kernel compile-time args (CB indices were missing)
- Fix runtime_args format (4D -> 3D list structure)
- Skip execution stages blocked by TensorAccessorArgs codegen issue
- Skip fused op translation (compute-only, no reader/writer threads)
- Update README with blocking issue documentation
@brnorris03 brnorris03 changed the title [test] pytest framework for different types of e2e (and me2e) tests [test] Add e2e pytest framework Dec 26, 2025
@brnorris03 brnorris03 changed the title [test] Add e2e pytest framework [test] Add e2e pytest framework for middle end (and DSL) Dec 26, 2025
@brnorris03 brnorris03 mentioned this pull request Dec 26, 2025
3 tasks
cd /home/bnorris/tt/tt-lang && source build/env/activate && TTLANG_DEBUG_KERNELS=1 timeout 90 pytest "test/me2e/test_compute_ops.py::test_compute[TestConfig(tile_h=32, tile_w=32, block_h=1, block_w=1, dtype=torch.bfloat16, num_tiles=1, buffer_factor=1, memory_layout=<MemoryLayout.INTERLEAVED: 'interleaved'>)-add]" -vsx
@brnorris03 brnorris03 changed the title [test] Add e2e pytest framework for middle end (and DSL) [test] Add me2e pytest framework for middle end Jan 19, 2026
brnorris03 added a commit that referenced this pull request Jan 20, 2026
… utilities (#249)

### What?
Extract shared kernel execution logic into `python/ttl/kernel_runner.py`
and move test utilities from `test/python/test_helpers.py` to
`test/ttlang_test_utils.py` at the test root.

### Why?
Kernel execution logic was duplicated between
`CompiledTTNNKernel.__call__` and test code (and would be further
replicated in the compiler me2e tests), making it difficult to maintain
a single source of truth for building kernel descriptors and CB
descriptors. Test utilities were also scattered in
`test/python/test_helpers.py`, making them less accessible to lit tests
and other test infrastructure.

### How?
- Created `python/ttl/kernel_runner.py` with reusable functions
(`build_kernel_descriptors`, `build_cb_descriptors`,
`run_kernel_on_device`) for building kernel descriptors, CB descriptors,
and executing kernels via `ttnn.generic_op`.
- Moved `test/python/test_helpers.py` to `test/ttlang_test_utils.py` at
the test root, making utilities accessible to both pytest and lit tests.
- Refactored `CompiledTTNNKernel.__call__` to delegate to
`run_kernel_on_device()` from `kernel_runner.py`, reducing code
duplication.
- Updated all test imports to use the new `ttlang_test_utils` location.

### How to Test?
Run the existing tests to verify there are no regressions (check-ttlang-all passes).
Also tested in the me2e PR that depends on this one: #167

### Checklist:
*   [x] Self-reviewed (style, logic)
*   [x] Added tests (or justified none needed)
*   [x] PR is small and focused (one task)
@brnorris03 brnorris03 marked this pull request as ready for review January 20, 2026 23:02
@brnorris03 brnorris03 requested a review from a team as a code owner January 20, 2026 23:02