[Discussion] ttl dialect proposal (plan)
#54
base: main
Conversation
Force-pushed 11769f9 to 4c4f4d1
Force-pushed 4c4f4d1 to 16a5f0e
docs/TTL_Dialect_Plan.md (Outdated)
```
Python Kernel → Python AST → TTL Dialect → TTL Passes → TTKernel → ConvertTTKernelToEmitC → C++ Source
                                  ↓                                         ↓
                     Validation, Synchronization                       C++ Compiler
```
With the C++ source plus metadata about input/output tensors, CBs, etc., we go directly to the TT-NN generic operation (it internally compiles and runs the C++):
Great point! I am changing the runtime integration section completely and will update the workflows here.
docs/TTL_Dialect_Plan.md (Outdated)
```cpp
// Calculate total elements for TTKernel CB conversion
int64_t getTotalElements() const {
  int64_t elementsPerBlock = std::accumulate(
      getShape().begin(), getShape().end(), 1, std::multiplies<int64_t>());
```
Probably can re-use getElementsPerBlock below.
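A minimal sketch of the suggested re-use, where `getTotalElements` delegates to a `getElementsPerBlock` helper instead of repeating the reduction. The surrounding struct, the shape member, and the `numBuffers` factor are assumptions for illustration, not the actual TTL code:

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <numeric>
#include <vector>

// Hypothetical stand-in for the CB type discussed in the thread.
struct CBTypeSketch {
  std::vector<std::int64_t> shape;
  std::int64_t numBuffers; // assumed extra factor for illustration

  // Single home for the accumulate reduction over the block shape.
  std::int64_t getElementsPerBlock() const {
    return std::accumulate(shape.begin(), shape.end(), std::int64_t{1},
                           std::multiplies<std::int64_t>());
  }

  // Re-uses the helper rather than duplicating std::accumulate.
  std::int64_t getTotalElements() const {
    return getElementsPerBlock() * numBuffers;
  }
};
```

Note the explicit `std::int64_t{1}` initial value: passing a plain `1` (as in the quoted snippet) makes `std::accumulate` compute in `int`, which can overflow for large shapes.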
docs/TTL_Dialect_Plan.md (Outdated)
> Note: TTKernel doesn't support per-transaction waits. All ttl.wait operations lower to global DMA barriers (`ttkernel.noc_async_read_barrier` or `ttkernel.noc_async_write_barrier`). This type exists for ordering and future optimization opportunities.
In the case of a transfer from a pipe, the wait will likely lower into waiting on a semaphore.
docs/TTL_Dialect_Plan.md (Outdated)
This document is no longer being edited; it was split into more manageable parts in the docs/ttl directory.
Force-pushed 052d517 to a414bb1
Force-pushed a414bb1 to acca0ba
Force-pushed 1f3bd1d to 51686b7
…dd table of python ttl -> ttl dialect mapping
Force-pushed 8a738d0 to 6f40799
```tablegen
let summary = "Handle for asynchronous transfer with transaction ID tracking";
let description = [{
  Transfer handle for DMA operations that maps to a TTKernel transaction ID (TRID).
  Each ttl.copy operation receives a unique TRID (0-15), and ttl.wait operations
```
What function in TTKernel returns trid?
None; the compiler must generate it.
Is there example code?
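The thread leaves this open; below is a minimal sketch (an assumption for illustration, not TTKernel or tt-mlir code) of how a compiler could assign TRIDs per ttl.copy, round-robin over the 0-15 range mentioned in the quoted description:

```cpp
#include <cassert>

// Hypothetical compiler-side TRID allocator: since no TTKernel function
// returns a TRID, the lowering would hand one out per ttl.copy.
// Wrapping after 16 allocations is a simplification; a real allocator
// would have to prove the previous transfer with that TRID has completed.
class TridAllocator {
  int next_ = 0;

public:
  int allocate() {
    int trid = next_;
    next_ = (next_ + 1) & 0xF; // TRIDs are limited to 0-15
    return trid;
  }
};
```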
docs/ttl/02_TTL_Type_System.md (Outdated)
> Arity requirement: The dst_range tuple must have the same arity as the grid rank to prevent ambiguity. For a 2D grid (grid_x, grid_y), both dimensions must be specified explicitly. Use slice(x, x+1) for a single core in that dimension.
I think the language spec has a weaker constraint: pipes within the same pipe net must have the same dimensionality, but that dimensionality can be arbitrary, since the grid_size and core functions give us this ability. For example, a 1D pipe net can be defined within a 2D grid.
I will update to match.
> Runtime representation: PipeNet carries no runtime data. During lowering to TTKernel, PipeNet operations are expanded and removed:
> - ttl.create_pipenet %pipe1, %pipe2, ... → stores pipe list in operation operands
I wonder if it is worth materializing each pipe description here. In most cases the pipe list will be formed with a Python list comprehension; maybe we just capture that comprehension's loop nest. But maybe not in the MVP.
Not sure; this is just one possibility. I think it's probably easier to use a container (tensor) for storing the pipes.
> TTL-specific attributes are defined below:

```tablegen
def TTL_SliceAttr : AttrDef<TTL_Dialect, "Slice"> {
```
Do we have a representation for slicing into a tensor accessor?
Not at the moment, but ttnn tensors do support slice. We may need an extra op (that we define) to extract slices from the tensor accessor before it can be used as an arg in another op.
Yes, TT-NN does have slicing, but I am referring to tensor slicing with ttl.copy. I guess we need a way to convert slice expressions to a shard id/page id for noc_async_xxx_shard/page.
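To make the idea concrete, here is a hedged sketch of such a conversion (all names, and the row-major grid layout, are assumptions for illustration, not TTKernel API): a slice that covers exactly one shard maps to a flat shard id by linearizing the shard's grid coordinate.

```cpp
#include <cassert>
#include <cstdint>

// Half-open slice [start, stop), mimicking Python's slice(start, stop).
struct Slice {
  std::int64_t start, stop;
};

// For a tensor sharded over a 2D grid with shardRows x shardCols shards,
// a slice aligned to one shard maps to a flat shard id via row-major
// linearization of the grid coordinate. A full implementation would also
// verify that the slice is shard-aligned and spans exactly one shard.
std::int64_t sliceToShardId(Slice rowSlice, Slice colSlice,
                            std::int64_t shardRows, std::int64_t shardCols,
                            std::int64_t gridCols) {
  std::int64_t gy = rowSlice.start / shardRows; // grid row of the shard
  std::int64_t gx = colSlice.start / shardCols; // grid column of the shard
  return gy * gridCols + gx;                    // row-major flat shard id
}
```

For the 64x64 tensor sharded on a 2x2 grid from the example below, each shard is 32x32, so the slice `[32:64, 0:32]` would land on grid coordinate (1, 0), i.e. flat shard id 2.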
docs/ttl/02_TTL_Type_System.md (Outdated)
```mlir
// TTL IR (%tensor : tensor<..., #ttl.tensor_encoding<DeviceDRAM,
//                                #ttl.layout<sharded, grid=[2,2]>>>)
%accessor = ttl.tensor_accessor %tensor
%xf = ttl.copy %accessor[%shard_id], %cb
```
Where this %shard_id comes from? Do we convert slices into it?
That's the high-level idea, but not completely sure about the syntax yet.
So probably not the direct indexing like above, but something closer to the more MLIR-typical form below (again, different syntax can be implemented as needed/wanted):
```mlir
// TTL IR (After Python AST Compilation)
// Python: shard_id = ttl.core(dim=1)
//         xf = ttl.copy(a[shard_id], a_blk)

// Tensor accessor wraps the tensor with its layout metadata
%a_accessor = ttl.tensor_accessor %a
    : tensor<64x64xf32, #ttl.tensor_encoding<DeviceDRAM, #ttl.layout<sharded, grid=[2,2]>>>
    -> !ttl.accessor<tensor<64x64xf32, #ttl.tensor_encoding<DeviceDRAM, #ttl.layout<sharded, grid=[2,2]>>>>

// Core coordinate flattened to 1D (0-3 for 2x2 grid)
%shard_id = ttl.core {dims = 1} : index

// Reserve CB slot
%a_blk = ttl.cb_reserve %a_cb : !ttl.circular_buffer<[1,1], !ttcore.tile<32x32,f32>, 2>
    -> tensor<1x1x!ttcore.tile<32x32,f32>, #ttl.tensor_encoding<L1, #ttl.layout<tiled>>>

// Copy from accessor slice to CB block
// Indices are explicit operands; direction inferred from operand types
%xf_a = ttl.copy from %a_accessor at [%shard_id] to %a_blk
    : !ttl.accessor<...>, index -> !ttl.transfer_handle
ttl.wait %xf_a : !ttl.transfer_handle
ttl.cb_push %a_cb, %a_blk : !ttl.circular_buffer<...>, tensor<...>
```

Force-pushed 0cbcd6c to c53cfcc
This draft PR is solely for discussion on a proposed ttl dialect (not intended to merge). See TTL_Dialect_Plan.md.