Conversation

@zoecarver
Contributor

Adds support for tensor[row, col] syntax in TTL kernels to access specific tiles within multi-tile tensors. Previously, tensor indexing was restricted to [0, 0].

  • Add TensorSliceType and TensorSliceOp to the TTL dialect for representing tile-indexed tensor views
  • Update Python DSL to emit ttl.tensor_slice ops when tensor subscript syntax is used
  • Add lowering in ConvertTTLToTTKernel to compute correct linear tile offsets (row * num_cols + col) for NOC read/write operations
  • Add simple_tensor_slice.py lit test and pytest_tensor_slice.py parameterized test covering 1x1 through 16x16 tile shapes
  • Based on #212 (Fix multi-tile CB addressing and add elementwise shape sweep tests)
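As a sketch of the offset computation described in the lowering bullet above (the helper name and tile-size arithmetic here are illustrative, not the actual pass code):

```python
def tile_linear_offset(row: int, col: int, num_cols: int, tile_size_bytes: int) -> int:
    """Byte offset of tile (row, col) in a row-major multi-tile tensor.

    Mirrors the row * num_cols + col addressing the lowering emits for
    NOC read/write operations (illustrative, not the pass code itself).
    """
    tile_index = row * num_cols + col
    return tile_index * tile_size_bytes


# A 64x64 bf16 tensor is a 2x2 grid of 32x32 tiles; each tile is
# 32 * 32 * 2 = 2048 bytes, so tile (1, 1) starts at byte 6144.
offset = tile_linear_offset(row=1, col=1, num_cols=2, tile_size_bytes=2048)
print(offset)  # → 6144
```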

@zoecarver zoecarver requested a review from a team as a code owner January 6, 2026 16:37
@zoecarver
Contributor Author

test/python/test_tensor_slice_indices.py .................................................................................... [ 18%]
............................................................................................................................. [ 46%]
................sssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssss [ 74%]
ssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssss          [100%]

================================================= 225 passed, 225 skipped in 18.08s =================================================

Contributor

@brnorris03 left a comment

This fails for me on QB (some MLIR lit tests fail, and it seems not to have the TTNN build fixes). Is it ready for review? I tried my usual clean build + test with a pre-built tt-mlir that includes ttnn jit:

rm -rf build; deactivate; cmake -GNinja -B build -DTTMLIR_BUILD_DIR=$HOME/tt/tt-mlir/build-ttlang && source build/env/activate && ninja -C build && time ninja -C build check-ttlang-all

I got "[1/1] Skipping ttlang Python lit tests (TTNN not available)" and

Failed Tests (8):
  TTLang :: ttlang/Conversion/TTLToTTKernel/compute_fused_chain.mlir
  TTLang :: ttlang/Conversion/TTLToTTKernel/dma_single_core.mlir
  TTLang :: ttlang/Translate/TTLToCpp/compute_fused_chain_to_cpp.mlir
  TTLang :: ttlang/Translate/TTLToCpp/compute_with_data_movement.mlir
  TTLang :: ttlang/Translate/TTLToCpp/dma_loop_multi_tile_nontrivial_cb.mlir
  TTLang :: ttlang/Translate/TTLToCpp/dma_multi_tile_batched_in_user_loop.mlir
  TTLang :: ttlang/Translate/TTLToCpp/dma_multi_tile_read.mlir
  TTLang :: ttlang/Translate/TTLToCpp/dma_multi_tile_same_layout_different_cb.mlir

@zoecarver
Contributor Author

This is ready for review. I will look into the test failures.

@zoecarver zoecarver force-pushed the zoecarver/sweep-over-shapes branch 6 times, most recently from 7367541 to 55b68d2 Compare January 8, 2026 15:27
Base automatically changed from zoecarver/sweep-over-shapes to main January 8, 2026 15:39
@brnorris03
Contributor

brnorris03 commented Jan 8, 2026

High level question first -- why is the tensor dialect not appropriate to use for this (necessitating custom ops)? For example, tensor.extract_slice for extracting slices.

@zoecarver zoecarver force-pushed the zoecarver/dynamic-tensor-subscript branch from d601524 to 821b6e4 Compare January 8, 2026 16:12
@zoecarver zoecarver force-pushed the zoecarver/dynamic-tensor-subscript branch from 821b6e4 to e9061a4 Compare January 8, 2026 16:17
@zoecarver zoecarver force-pushed the zoecarver/dynamic-tensor-subscript branch from a141858 to e925932 Compare January 8, 2026 21:20
@zoecarver zoecarver force-pushed the zoecarver/dynamic-tensor-subscript branch from e925932 to e91abda Compare January 8, 2026 21:29
%c0 = arith.constant 0 : index
%c1 = arith.constant 1 : index
%slice = ttl.tensor_slice %tensor[%c0, %c1]
: tensor<64x64xbf16, #layout> -> !ttl.tensor_slice<tensor<64x64xbf16, #layout>>
Contributor

Can we have the return type also be a tensor of the new resulting slice size? In this case it should be 32x32xbf16.

Contributor

@arichinsTT commented Jan 8, 2026

I don't think there's a benefit to a tensor_slice value type. If a tensor is the result of a slice, it is up to the optimization to check the parent of the value; otherwise this adds unnecessary baggage.

Contributor

@brnorris03 left a comment

looks good in general, thank you

Comment on lines +165 to +180
def visit_Subscript(self, node):
"""Handle tensor[row, col] indexing for TTL tensor slices."""
tbl = self._var_exists(node.value.id)
if not tbl:
self._raise_error(node, f"Unknown variable: {node.value.id}")

tensor = tbl[node.value.id]
if not isinstance(getattr(tensor, "type", None), RankedTensorType):
self._raise_error(node, "TTL only supports subscripting tensors")

if isinstance(node.slice, ast.Tuple):
indices = [self._build_index_value(elt) for elt in node.slice.elts]
else:
indices = [self._build_index_value(node.slice)]

return (tensor, indices)
Contributor

What are the constraints on the subscripts (row, col) -- can they be arbitrary expressions, e.g., calls to range? Is node.slice a python slice object?

Contributor Author

It is an AST object, not a python object. We don't handle ranges today.

Currently supported:

  1. Integer literals: tensor[0, 1] → ast.Constant nodes → arith.ConstantOp with IndexType
  2. Loop induction variables: tensor[r, c] where r, c come from for r in range(N) → already IndexType from scf.ForOp
  3. Arithmetic expressions: tensor[i+1, j*2] → visits BinOp, produces i64, then casts to index via IndexCastOp

What happens with a slice like tensor[0:2, 0:3]?

  1. node.slice would be an ast.Tuple containing two ast.Slice objects
  2. _build_index_value is called on each ast.Slice
  3. ast.Slice is not ast.Constant, so it calls self.visit(node) on the Slice
  4. No visit_Slice method exists
  5. ast.Slice is not in supported_nodes (base_ast.py:107-128)
  6. Error: NotImplementedError("visit Slice not supported")
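The AST distinctions described above can be reproduced with Python's ast module alone. This standalone sketch (the classify_subscript helper is hypothetical, not part of the DSL code) shows which node types each supported and unsupported index form produces:

```python
import ast


def classify_subscript(src: str) -> str:
    """Return the AST node kinds used as indices in a subscript expression."""
    node = ast.parse(src, mode="eval").body
    assert isinstance(node, ast.Subscript)
    sl = node.slice
    # Python 3.9+: node.slice is the index expression itself (a Tuple for
    # tensor[a, b]), with no ast.Index wrapper.
    elts = sl.elts if isinstance(sl, ast.Tuple) else [sl]
    kinds = []
    for e in elts:
        if isinstance(e, ast.Constant):
            kinds.append("constant")   # tensor[0, 1]
        elif isinstance(e, ast.Name):
            kinds.append("name")       # tensor[r, c]
        elif isinstance(e, ast.BinOp):
            kinds.append("binop")      # tensor[i+1, j*2]
        elif isinstance(e, ast.Slice):
            kinds.append("slice")      # tensor[0:2, 0:3] -- unsupported in the DSL
        else:
            kinds.append("other")
    return ",".join(kinds)


print(classify_subscript("tensor[0, 1]"))      # → constant,constant
print(classify_subscript("tensor[r, c]"))      # → name,name
print(classify_subscript("tensor[i+1, j*2]"))  # → binop,binop
print(classify_subscript("tensor[0:2, 0:3]"))  # → slice,slice
```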

@arichinsTT
Contributor

High level question first -- why is the tensor dialect not appropriate to use for this (necessitating custom ops)? For example, tensor.extract_slice for extracting slices.

I vote yes, especially if we plan on utilizing the memref and bufferization dialects or something similar, but it definitely depends on the lowering that follows, and on what lowers into ttkernel.

Contributor

@arichinsTT left a comment

No tensor_slice value type; have it return the reduced tensor, which should remove a lot of cases. I think it is worthwhile to have handling for multiple-tile slices from the get-go.

@zoecarver
Contributor Author

High level question first -- why is the tensor dialect not appropriate to use for this (necessitating custom ops)? For example, tensor.extract_slice for extracting slices.
I vote yes, especially if we plan on utilizing memref and bufferization dialects or something similar, but it def depends on the lowering after, and what lowers into ttkernel

Based on offline discussion, this is going to add a lot of logic and checking in both lowering and in building the static + dynamic offsets in python. You can see the diff here: https://github.com/tenstorrent/tt-lang/compare/zoecarver/dynamic-tensor-subscript...zoecarver/ttl-tensor-slice-to-mlir-tensor-extract?expand=1

My recommendation is to land this, and then investigate how to move to tensor.extract_slice after the fact, maybe there is a cleaner way to map it. Is that OK with you, Alex?

Regarding your comment about removing TensorSliceType and using tensor directly, the biggest inconsistency I see with that is the layout. If we use tensor type directly, it will have to point to a layout with a different shape:

  #ttnn_layout = #ttnn.ttnn_layout<..., memref<1x4x!ttcore.tile<32x32, bf16>, #l1>, ...>
                                         ^^^^ shape encoded here

I don't think this is the end of the world; it won't affect lowering today, but it might be confusing in the future if we wanted to, e.g., validate that the tensor shape == layout shape, or use the layout to make some decision. I'm generally against this kind of defensive design, but I also think the separate type adds some semantic clarity by explicitly saying "this is a slice".

Given this, what do you think? Do you still want me to remove TensorSliceType and just use the tensor type? I'm happy either way.

@zoecarver
Contributor Author

Oops accidentally pushed to the wrong branch 🫣

All comments have been addressed or responded to. Thank you!

Contributor

@brnorris03 left a comment

lgtm, thank you for updating!


// Copy A to CB0
%xf_a = ttl.copy %a, %cb0 : (tensor<64x64xf32, #layout>, !ttl.cb<[2, 2], f32, 2>) -> !ttl.transfer_handle<read>
%slice_a = ttl.tensor_slice %a[%c0, %c0] : tensor<64x64xf32, #layout> -> tensor<64x64xf32, #layout>
Contributor

This slice is capturing more than a single tile, based on the output type and the copy into a 2x2 CB. Is this allowed with index slicing?

Contributor Author

Yes, we should be allowed to copy 2x2 tiles at a time if the CB shape is 2x2.
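As a sanity check of the tile math implied here (hypothetical helper, not PR code): a 64x64 tensor of 32x32 tiles is a 2x2 tile grid, which matches the [2, 2] CB shape, so the whole tensor fits in a single copy.

```python
def num_tiles(shape, tile=(32, 32)):
    """Tile-grid dimensions for a tile-aligned tensor shape."""
    rows, cols = shape
    th, tw = tile
    assert rows % th == 0 and cols % tw == 0, "shape must be tile-aligned"
    return rows // th, cols // tw


# 64x64 elements over 32x32 tiles gives a 2x2 tile grid.
print(num_tiles((64, 64)))  # → (2, 2)
```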

Contributor

Seems like range-based slicing is a later TODO.

@zoecarver zoecarver force-pushed the zoecarver/dynamic-tensor-subscript branch from 049da97 to a3a5d90 Compare January 9, 2026 22:15
@phizalev-TT
Contributor

phizalev-TT commented Jan 11, 2026

When testing on QB with CB shape (>1, >1) it hangs.

Contributor

@arichinsTT left a comment

thanks for discussion! looks good



@zoecarver zoecarver merged commit 3ddf92f into main Jan 12, 2026
5 checks passed
@zoecarver zoecarver deleted the zoecarver/dynamic-tensor-subscript branch January 12, 2026 18:50