@brnorris03
Contributor

What?

Adds layout-aware transfer optimization to TTL-to-TTKernel lowering. The lowering analyzes tensor layouts and selects optimal DMA strategies based on data contiguity.

Why?

Block transfers (noc_async_read/noc_async_write) are more efficient than tile-by-tile transfers for row-major data. Selecting them whenever the data is contiguous in memory makes those transfers faster.

How?

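Introduces a ContiguityLevel classification (FullyContiguous, RowContiguous, TileContiguous) that describes how contiguous a tensor's data is in memory.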

The lowering inspects TTNNLayoutAttr to determine if the tensor uses tiles or scalar elements, then dispatches to the appropriate transfer strategy.
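
For reviewers who want a concrete picture of the dispatch, here is a minimal sketch of the idea, not the code in this PR. `ElementKind`, `TransferStrategy`, `selectStrategy`, and `readOpName` are hypothetical names invented for illustration; the real lowering reads this information from TTNNLayoutAttr. The sketch also assumes scalar (row-major) data always takes the block path, which ignores the sharded case.

```cpp
#include <string_view>

// Hypothetical stand-in for what the lowering learns from the layout
// attribute: does the tensor hold tiles or scalar elements?
enum class ElementKind { Tile, Scalar };

// Hypothetical names for the two transfer paths described above.
enum class TransferStrategy { PerTile, Block };

// Tiled tensors keep the per-tile path; scalar (row-major) tensors are
// assumed here to qualify for a block transfer.
constexpr TransferStrategy selectStrategy(ElementKind kind) {
  return kind == ElementKind::Tile ? TransferStrategy::PerTile
                                   : TransferStrategy::Block;
}

// Name of the read operation each strategy would emit.
constexpr std::string_view readOpName(TransferStrategy strategy) {
  switch (strategy) {
  case TransferStrategy::PerTile:
    return "noc_async_read_tile";
  case TransferStrategy::Block:
    return "noc_async_read";
  }
  return "";
}
```

With these definitions, `readOpName(selectStrategy(ElementKind::Scalar))` yields the block read, while a tiled tensor keeps the per-tile read.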

Deferred

How to Test?

ninja -C build check-ttlang

Checklist:

  • Self-reviewed (style, logic)
  • Added tests (documents optimization levels, exercises tile and block paths)
  • PR is small and focused (one task)

@brnorris03 brnorris03 force-pushed the bnorris/block-transfers branch 2 times, most recently from 6abc94d to 75da024 Compare December 25, 2025 05:10
@brnorris03 brnorris03 force-pushed the bnorris/ttl-dm-kernel-lowering-fuse-sibling-loops branch from 7be41b3 to 5429a7c Compare December 25, 2025 06:21
Analyzes tensor layouts to select optimal DMA strategies:

- FullyContiguous (row-major + interleaved): single noc_async_read/write
- RowContiguous (row-major + sharded): per-row transfers (TODO: #118)
- TileContiguous (tiled): per-tile noc_async_read_tile (current default)

New files:
- LayoutUtils.h/cpp: ContiguityLevel enum, analyzeLayoutContiguity()
- block_transfers.mlir: tests for tiled and row-major layouts

The lowering inspects TTNNLayoutAttr to determine if tensors use tiles
or scalar elements, then dispatches to the appropriate transfer strategy.
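
The three levels listed in the commit message can be sketched roughly as follows. This is an illustration under stated assumptions, not the contents of LayoutUtils.h: the enum values come from the commit message, but `LayoutSummary` and `classifyContiguity` are hypothetical simplifications (the actual entry point is analyzeLayoutContiguity(), whose signature is not shown here).

```cpp
// Contiguity levels as named in the commit message.
enum class ContiguityLevel { FullyContiguous, RowContiguous, TileContiguous };

// Hypothetical, simplified view of a tensor layout. The real code would
// derive these properties from TTNNLayoutAttr.
struct LayoutSummary {
  bool isTiled;   // true: tiled layout, false: row-major scalar elements
  bool isSharded; // true: sharded, false: interleaved
};

// Maps layout properties to a contiguity level:
//   tiled                    -> TileContiguous  (per-tile transfers)
//   row-major + sharded      -> RowContiguous   (per-row transfers)
//   row-major + interleaved  -> FullyContiguous (single block transfer)
constexpr ContiguityLevel classifyContiguity(LayoutSummary layout) {
  if (layout.isTiled)
    return ContiguityLevel::TileContiguous;
  if (layout.isSharded)
    return ContiguityLevel::RowContiguous;
  return ContiguityLevel::FullyContiguous;
}
```

Checking the tiled case first keeps the current default behavior for every tiled tensor, whether it is sharded or not. That ordering is an assumption of this sketch.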
@brnorris03 brnorris03 force-pushed the bnorris/block-transfers branch from 75da024 to 1ad773a Compare December 25, 2025 06:28
