[Moore] Introduce Mem2Reg to eliminate local variables #7082

hailongSun2000 · 2024-05-23T07:02:42Z

Please just view the `MooreOps.cpp` file to check the related implementation.

First

module Foo;
  int x, y;
  always_comb begin
    int a;
    a = x + 1;
    y = a;
  end
endmodule

/home/phoenix/work/HDLBits/test01.sv:2:7: error: all variables: %0 = "moore.variable"() <{name = "x"}> : () -> !moore.i32
  int x, y;
      ^
/home/phoenix/work/HDLBits/test01.sv:2:10: error: all variables: %1 = "moore.variable"() <{name = "y"}> : () -> !moore.i32
  int x, y;
         ^
/home/phoenix/work/HDLBits/test01.sv:4:9: error: all variables: %2 = "moore.variable"() <{name = "a"}> : () -> !moore.i32
    int a;
        ^
/home/phoenix/work/HDLBits/test01.sv:4:9: error: local varialbes: %2 = "moore.variable"() <{name = "a"}> : () -> !moore.i32
    int a;
        ^
/home/phoenix/work/HDLBits/test01.sv:5:5: error: blocking assignments: %2 = "moore.variable"() <{name = "a"}> : () -> !moore.i32
    a = x + 1;
    ^
/home/phoenix/work/HDLBits/test01.sv:6:5: error: blocking assignments: %2 = "moore.variable"() <{name = "a"}> : () -> !moore.i32
    y = a;
    ^
module {
  moore.module @Foo {
    %x = moore.variable  : !moore.i32
    %y = moore.variable  : !moore.i32
    moore.procedure always_comb {
      %a = moore.variable  : !moore.i32
      %0 = moore.constant 1 : !moore.i32
      %1 = moore.add %x, %0 : !moore.i32
      moore.blocking_assign %a, %1 : !moore.i32
      moore.blocking_assign %y, %a : !moore.i32
    }
  }
}

Second

module Foo;
  int x, y;
  always_comb begin
    int a;
    a = x + 1;
   // y = a;
  end
endmodule

/home/phoenix/work/HDLBits/test01.sv:2:7: error: all variables: %0 = "moore.variable"() <{name = "x"}> : () -> !moore.i32
  int x, y;
      ^
/home/phoenix/work/HDLBits/test01.sv:2:10: error: all variables: %1 = "moore.variable"() <{name = "y"}> : () -> !moore.i32
  int x, y;
         ^
/home/phoenix/work/HDLBits/test01.sv:4:9: error: all variables: %2 = "moore.variable"() <{name = "a"}> : () -> !moore.i32
    int a;
        ^
/home/phoenix/work/HDLBits/test01.sv:4:9: error: local varialbes: %2 = "moore.variable"() <{name = "a"}> : () -> !moore.i32
    int a;
        ^
/home/phoenix/work/HDLBits/test01.sv:5:5: error: blocking assignments: %2 = "moore.variable"() <{name = "a"}> : () -> !moore.i32
    a = x + 1;
    ^
/home/phoenix/work/HDLBits/test01.sv:4:9: error: slot: %3 = "moore.variable"() <{name = "a"}> : () -> !moore.i32
    int a;
        ^
/home/phoenix/work/HDLBits/test01.sv:2:7: error: all variables: %0 = "moore.variable"() <{name = "x"}> : () -> !moore.i32
  int x, y;
      ^
/home/phoenix/work/HDLBits/test01.sv:2:10: error: all variables: %1 = "moore.variable"() <{name = "y"}> : () -> !moore.i32
  int x, y;
         ^
module {
  moore.module @Foo {
    %x = moore.variable  : !moore.i32
    %y = moore.variable  : !moore.i32
    moore.procedure always_comb {
      %0 = moore.constant 1 : !moore.i32
      %1 = moore.add %x, %0 : !moore.i32
    }
  }
}

hailongSun2000 · 2024-05-23T07:19:45Z

I'm not sure at which stage the changes (such as `a` being substituted with `x + 1`) occur. Thanks in advance for any help ❤️.

fabianschuiki · 2024-05-23T21:38:28Z

I've been digging into this a bit to see how Mem2Reg does its magic and why this isn't promoted. I've looked at the following:

func.func @Foo() -> !moore.i32 {
  %x = moore.variable : !moore.i32
  %0 = moore.constant 42 : !moore.i32
  moore.blocking_assign %x, %0 : !moore.i32
  return %x : !moore.i32
}

My trace of Mem2Reg:

1. Collect `%x = moore.variable` into the initial set of allocators to promote.
2. Go through the allocators to promote.
3. Call `computeInfo` of `MemorySlotPromotionAnalyzer` on that `%x` variable.
4. Call `computeBlockingUses` for `%x`.
5. Visit all users of `%x` and make sure that they are all promotable.
6. This visits `func.return %x`, which is not promotable, and aborts.
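
The abort in the last two steps of the trace can be illustrated with a toy model. This is a minimal Python sketch, not MLIR's actual implementation: the `Op` class, the `PROMOTABLE` set, and `first_blocking_use` are hypothetical names, and plain `load`/`store` stand in for whatever ops a dialect marks as promotable.

```python
from dataclasses import dataclass, field


@dataclass
class Op:
    name: str                                   # e.g. "store", "func.return"
    operands: list = field(default_factory=list)


# Assumption: only these op kinds can be rewritten by the promotion.
PROMOTABLE = {"load", "store"}


def first_blocking_use(slot, ops):
    """Return the first op that uses `slot` but is not promotable, else None."""
    for op in ops:
        if slot in op.operands and op.name not in PROMOTABLE:
            return op
    return None


# The function from the trace: %x is written, then returned directly.
body = [
    Op("store", ["%x", "%0"]),
    Op("func.return", ["%x"]),
]
blocker = first_blocking_use("%x", body)
print(blocker.name)  # func.return

# Same function, but the value is read through a dedicated load first.
fixed_body = [
    Op("store", ["%x", "%0"]),
    Op("load", ["%x"]),
    Op("func.return", ["%1"]),
]
print(first_blocking_use("%x", fixed_body))  # None
```

Once the read goes through a dedicated op, the slot itself is no longer an operand of the return, so nothing blocks promotion in this model.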

So the problem is that we're using the variable %x directly in the return. This is definitely a design smell in the Moore dialect at the moment: we don't have any op that represents reading from a variable. Most other IRs have dedicated load and store ops to interact with variables, and the only operations that are ever performed on a variable are loads and stores. For example, in MemRef you'd have something like this:

func.func @Foo() -> i32 {
  %x = memref.alloca() : memref<i32>
  %0 = arith.constant 42 : i32
  memref.store %0, %x[] : memref<i32>
  %1 = memref.load %x[] : memref<i32>
  return %1 : i32
}

Instead of having variable/alloca directly return an i32, the MemRef dialect returns a reference type memref<i32>, which indicates that you're not dealing with an i32 value itself, but rather with a variable/memory reference that contains an i32. If you want to change %x, you have to use a memref.store. And if you want to know the value of %x, you have to use a memref.load to read its value.

We should probably do the same thing in the Moore dialect: moore.variable should probably return a !moore.ref<i32> type instead of i32 directly. moore.blocking_assign would then require the LHS to be a ref type and act like a memref.store. And we'd need something like moore.read_var that takes a ref and returns the current value, like a memref.load. (I think we already have a moore.read_lvalue that signals a value read, just not with a ref/lvalue type wrapper yet.)

hailongSun2000 · 2024-05-24T03:48:21Z

I looked at llvm.load and llvm.store; their value types are i32 or i* without a wrapper. I think the Mem2Reg pass aborts because %x is not substituted with %0. In other words, we lack an op similar to llvm.load/memref.load.

hailongSun2000 · 2024-05-24T05:45:07Z

I'm not sure whether we can use moore.read_lvalue instead of returning a variable directly, as in the following image.

I'll attempt to verify my thoughts.

Moxinilian · 2024-05-24T07:47:11Z

Hey! I designed the mem2reg pass. Fabian’s analysis is correct, the pointer must only be used by promotable operations for any promotion to happen. In particular, returning the pointer is not allowed (as it would escape). You need a dedicated operation to read from the value. In the LLVM implementation, the slots are pointed to via an llvm.ptr, not an element value directly.

hailongSun2000 · 2024-05-24T08:06:20Z

Thanks @Moxinilian ❤️! Can I take this to mean that we should use dedicated ops, like llvm.load to read and llvm.store to write the value, together with a type wrapper like memref<i32> rather than a bare i32? So we need dedicated ops to read and write explicitly, right?

hailongSun2000 · 2024-05-24T09:04:22Z

@fabianschuiki 🎉

module Foo;
  int x, y, z;
  always_comb begin
    int a;
    a = x + 1;
    y = a;
    a = a + 1;
    z = a;
  end
endmodule

module {
  moore.module @Foo {
    %x = moore.variable  : !moore.i32
    %y = moore.variable  : !moore.i32
    %z = moore.variable  : !moore.i32
    moore.procedure always_comb {
      %0 = moore.read_lvalue %x : !moore.i32
      %1 = moore.constant 1 : !moore.i32
      %2 = moore.add %0, %1 : !moore.i32
      moore.blocking_assign %y, %2 : !moore.i32
      %3 = moore.constant 1 : !moore.i32
      %4 = moore.add %2, %3 : !moore.i32
      moore.blocking_assign %z, %4 : !moore.i32
    }
  }
}

But if `a` is a global/module-level variable, nothing happens (slot promotion is aborted).
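
The promoted IR above is what store-to-load forwarding on the local `a` would be expected to produce. The following is a minimal Python sketch of that idea for a single straight-line block, under stated assumptions: the tuple encoding of ops and the name `promote_slot` are hypothetical, `load`/`store` stand in for `moore.read_lvalue`/`moore.blocking_assign`, and control flow (which the real pass handles) is ignored.

```python
def promote_slot(ops, slot):
    """Forward stores to loads for one slot in straight-line code.

    Each op is a tuple:
      ("store", ref, value)        write value into ref
      ("load", result, ref)        read ref into result
      (other, result, *operands)   any other op
    Returns the op list with every load/store of `slot` removed.
    """
    current = None   # value most recently stored into the slot
    replaced = {}    # removed load result -> value forwarded in its place
    out = []
    for op in ops:
        kind = op[0]
        if kind == "store" and op[1] == slot:
            current = replaced.get(op[2], op[2])
        elif kind == "load" and op[2] == slot:
            if current is None:
                raise ValueError("load before any store")
            replaced[op[1]] = current
        else:
            # Keep the op, rewriting operands that named a removed load.
            out.append((kind, op[1], *[replaced.get(v, v) for v in op[2:]]))
    return out


# a = x + 1; y = a; a = a + 1; z = a;
ops = [
    ("load", "%0", "%x"),
    ("add", "%1", "%0", "1"),
    ("store", "%a", "%1"),
    ("load", "%2", "%a"),
    ("store", "%y", "%2"),
    ("load", "%3", "%a"),
    ("add", "%4", "%3", "1"),
    ("store", "%a", "%4"),
    ("load", "%5", "%a"),
    ("store", "%z", "%5"),
]
promoted = promote_slot(ops, "%a")
for op in promoted:
    print(op)
```

In this model only `%a` is promoted; the accesses to `%x`, `%y`, and `%z` are left in place, which mirrors the remaining `moore.read_lvalue` and `moore.blocking_assign` ops in the output above.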

Moxinilian · 2024-05-24T09:37:51Z

I would recommend a dedicated type yes. I think you may technically not absolutely need it because the MemorySlot would model it already, but I think it would be better to make sure your slots do not accidentally escape like in our case.

Load/store operations on the other hand are absolutely necessary.

I’ll give a proper review to your code later when I have time if you want.
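
The point about a dedicated type can be illustrated with a small Python sketch. It is a toy model under stated assumptions, not the Moore dialect's verifier: `Ref` and `verify_return` are hypothetical names, and types are modeled as plain strings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ref:
    """Hypothetical wrapper type: a reference to a slot holding `element`."""
    element: str


def verify_return(declared, operand_type):
    """Toy verifier: the returned operand must have the declared result type."""
    if operand_type != declared:
        raise TypeError(f"return expects {declared!r}, got {operand_type!r}")


# Without a wrapper the variable itself is typed i32, so returning it
# verifies and the slot escapes unnoticed.
verify_return("i32", "i32")

# With the wrapper the same mistake is rejected by the type check.
try:
    verify_return("i32", Ref("i32"))
    rejected = False
except TypeError:
    rejected = True
print(rejected)  # True
```

The wrapper does not make promotion possible by itself; it only turns an accidental escape into a type error, which matches the distinction drawn above between the optional type and the mandatory load/store ops.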

hailongSun2000 · 2024-05-24T09:55:19Z

Thanks! I need to reimplement the related method. To prevent accidental slot escape, we can add a wrapper for the type. What do you think, @fabianschuiki?

hailongSun2000 · 2024-05-27T08:27:07Z

The deleteLocalVar pass and a lot of redundant code have been removed. Mem2Reg only runs when --ir-moore/--ir-hw is enabled, so I haven't added the related tests for the time being. I'm also not sure whether calling erase() is reasonable. If it's OK, I'll rename moore.read_lvalue to moore.read_value and update its description.

hailongSun2000 · 2024-05-27T08:31:31Z

lib/Conversion/ImportVerilog/Expressions.cpp

+    auto *readOp = context.convertExpression(expr.left()).getDefiningOp();
+    auto lhs = dyn_cast<moore::ReadLValueOp>(readOp).getOperand();
+    readOp->erase();
+


There is no way to determine ahead of time when to create moore.read_lvalue, so I call op->erase() to remove it if it was created by expr.left().

I think this is a good idea!

I have a feeling that we are going to introduce a dedicated expression lowering for left-hand sides of assignments. This is done in a similar fashion in C/C++:

int x, y;
x = y;

would lower to something like the following pseudo-code:

%x = alloca : ptr<int>
%y = alloca : ptr<int>
%0 = load %y   // y lowered as rvalue
store %x, %0   // x lowered as lvalue

If arrays or structs are involved, you'd have dedicated ops that allow you to convert a pointer/reference to the struct into a reference to the field you're interested in:

struct { int a, b; } x, y;
x.b = y.a;

would lower to something like this:

%x = alloca : ptr<struct<a: int, b: int>>
%y = alloca : ptr<struct<a: int, b: int>>
// x.b lowered as lvalue
%0 = struct_field_ref %x, "b" : ptr<struct<a: int, b: int>> -> ptr<int>
// y.a lowered as rvalue
%1 = load %y : ptr<struct<a: int, b: int>> -> struct<a: int, b: int>
%2 = struct_field %1, "a" : struct<a: int, b: int> -> int
// assignment
store %0, %2 : ptr<int>, int

These ops would allow us to reason about assignable lvalues that are just references to a storage location, and rvalues that are actual values. We would probably create two separate expression lowering functions, context.convertLvalueExpression and context.convertRvalueExpression: the former would always return a ref<T> type, and the latter would always return an actual value T.
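
The split into two lowering functions can be sketched as follows. This is a toy Python model rather than the ImportVerilog code: the `Lowering` class, its method names, and the tuple-based expression encoding are hypothetical, and the emitted strings reuse the pseudo-ops from the examples above without type annotations.

```python
import itertools


class Lowering:
    """Toy lowering that emits pseudo-ops as strings."""

    def __init__(self):
        self.ops = []
        self._ids = itertools.count()

    def _emit(self, text):
        # Allocate the next SSA-style name and record the op.
        name = f"%{next(self._ids)}"
        self.ops.append(f"{name} = {text}")
        return name

    def convert_lvalue(self, expr):
        """Return a name that refers to a storage location (a ref)."""
        if isinstance(expr, str):            # plain variable: its slot
            return f"%{expr}"
        kind, base, field_name = expr        # ("field", base, name)
        assert kind == "field"
        return self._emit(
            f'struct_field_ref {self.convert_lvalue(base)}, "{field_name}"')

    def convert_rvalue(self, expr):
        """Return a name that holds the current value of the expression."""
        if isinstance(expr, str):            # plain variable: load its value
            return self._emit(f"load %{expr}")
        kind, base, field_name = expr
        assert kind == "field"
        return self._emit(
            f'struct_field {self.convert_rvalue(base)}, "{field_name}"')

    def convert_assign(self, lhs, rhs):
        ref = self.convert_lvalue(lhs)       # left side stays a reference
        value = self.convert_rvalue(rhs)     # right side becomes a value
        self.ops.append(f"store {ref}, {value}")


# x.b = y.a;
lowering = Lowering()
lowering.convert_assign(("field", "x", "b"), ("field", "y", "a"))
print("\n".join(lowering.ops))
```

The left-hand side never produces a load in this model, so nothing has to be erased after the fact; that is the property the separate lvalue path is meant to provide.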

fabianschuiki · 2024-05-27T17:15:34Z

To prevent accidental slot escape, we can add a wrapper for the type. What do you think, @fabianschuiki?

I think that's a fantastic idea! All types are now in ODS, so it should be pretty easy to create these simple wrapper types. We could start with a ref<T> type, and later think about whether we'd want to distinguish between nets and variables if that simplifies some of the lowering passes.

hailongSun2000 · 2024-05-28T07:18:04Z

I'm trying to add a new type named RefType as a wrapper type. This reminds me of MooreLValueType, MooreRValueType, and VariableDeclOp, which had been removed.

fabianschuiki · 2024-05-28T22:29:48Z

This reminds me of MooreLValueType, MooreRValueType, and VariableDeclOp

Yeah absolutely… I think we anticipated that we would need an lvalue/rvalue split at some point, but it got in the way of the initial lowering work back then. Now that we know exactly what we need, we're in a much better position to add a ref type back in that handles this 😃

hailongSun2000 · 2024-06-05T10:21:53Z

In my private branch, the Mem2Reg pass can work. But this PR relies on #7095, because we add two options (--ir-moore and --ir-hw) to circt-verilog, and Mem2Reg only runs when --ir-moore is enabled. How should I add the related test case? Thanks in advance! 😄

fabianschuiki · 2024-06-07T02:42:11Z

In my private branch, the Mem2Reg pass can work.

That is great to hear 😍! Maybe once #7095 lands, you can rebase this PR onto that work and get everything up and running?

hailongSun2000 · 2024-06-07T02:45:21Z

Yeah, right! I'll just turn this draft into the PR later!

hailongSun2000 · 2024-06-07T03:52:28Z

Hey, @fabianschuiki. I added a new test file (optimization.sv) to check whether Mem2Reg works.

hailongSun2000 · 2024-06-07T03:56:27Z

I would recommend a dedicated type yes. I think you may technically not absolutely need it because the MemorySlot would model it already, but I think it would be better to make sure your slots do not accidentally escape like in our case.

Load/store operations on the other hand are absolutely necessary.

I’ll give a proper review to your code later when I have time if you want.

Hey, @Moxinilian. We designed a dedicated (reference)type wrapper to ensure slots don't accidentally escape. You can rest assured. Thanks again for your help! 😃

mingzheTerapines · 2024-06-07T05:45:06Z

test/Conversion/ImportVerilog/optimization.sv

This test exercises two passes, ImportVerilog and mem2reg. Personally, I think you only need to check the mem2reg pass.
Here is an HW test that only tests canonicalize.

// RUN: circt-opt -canonicalize='top-down=true region-simplify=true' %s | FileCheck %s

// CHECK-LABEL: hw.module @extract_noop(in %arg0 : i3, out "" : i3) {
// CHECK-NEXT:    hw.output %arg0
hw.module @extract_noop(in %arg0 : i3, out "": i3) {
  %x = comb.extract %arg0 from 0 : (i3) -> i3
  hw.output %x : i3
}

// Constant Folding
// CHECK-LABEL: hw.module @extract_cstfold(out result : i3) {
// CHECK-NEXT:    %c-3_i3 = hw.constant -3 : i3
// CHECK-NEXT:    hw.output %c-3_i3
hw.module @extract_cstfold(out result : i3) {
  %c42_i12 = hw.constant 42 : i12
  %x = comb.extract %c42_i12 from 3 : (i12) -> i3
  hw.output %x : i3
}

I agree with @mingzheTerapines 🙂! Since you are specifically checking whether mem2reg works on the Moore dialect, I would add a mem2reg.mlir test instead, use something like // RUN: circt-opt --mem2reg %s | FileCheck %s and use Moore dialect ops instead of SV inputs there.

The reasoning behind this is that you want a test that only checks the Verilog-to-Moore conversion, and a test that only checks whether mem2reg works on the Moore dialect. But you don't want to do both in the same file, since it makes tests very brittle and easy to break.

Hey, @mingzheTerapines @fabianschuiki. I added the mem2reg to circt-opt, which only verifies whether the mem2reg works.

This test and tool option looks perfect!

Moxinilian

Looks good to me beyond the newline nits and the test.

lib/Dialect/Moore/MooreOps.cpp

tools/circt-verilog/circt-verilog.cpp

hailongSun2000 · 2024-06-07T12:19:25Z

Oh! Thanks! I'll be careful later.

fabianschuiki

Very nice! Great to see mem2reg in action 😎

tools/circt-verilog/circt-verilog.cpp

hailongSun2000 added the Moore label May 23, 2024

hailongSun2000 requested a review from fabianschuiki May 23, 2024 07:02

hailongSun2000 marked this pull request as draft May 23, 2024 07:13

hailongSun2000 requested review from maerhart and dtzSiFive May 23, 2024 08:31

hailongSun2000 mentioned this pull request May 27, 2024

[Moore] Move struct types into ODS #7091

Merged

hailongSun2000 force-pushed the dev/apply-mem2reg branch from f6683a1 to e4987b6 Compare May 27, 2024 08:18

hailongSun2000 force-pushed the dev/apply-mem2reg branch from e4987b6 to 7e145d6 Compare June 7, 2024 03:47

hailongSun2000 marked this pull request as ready for review June 7, 2024 03:47

hailongSun2000 changed the title ~~Dev/apply mem2reg~~ [Moore] Introduce Mem2Reg to eliminate local variables. Jun 7, 2024

mingzheTerapines reviewed Jun 7, 2024

Moxinilian reviewed Jun 7, 2024

fabianschuiki changed the title ~~[Moore] Introduce Mem2Reg to eliminate local variables.~~ [Moore] Introduce Mem2Reg to eliminate local variables Jun 7, 2024

hailongSun2000 force-pushed the dev/apply-mem2reg branch 3 times, most recently from 29ea7e7 to 59b4814 Compare June 11, 2024 03:33

fabianschuiki approved these changes Jun 12, 2024

[Moore] A new pass to delete local temporary variables.

57c71fd

[Moore] Introduce Mem2Reg to elminate local variables. Co-authored-by: Fabian Schuiki <[email protected]>

hailongSun2000 force-pushed the dev/apply-mem2reg branch from 59b4814 to 57c71fd Compare June 13, 2024 02:10

hailongSun2000 merged commit 190c5f8 into llvm:main Jun 13, 2024
4 checks passed

hailongSun2000 deleted the dev/apply-mem2reg branch July 4, 2024 06:27

hailongSun2000 mentioned this pull request Jul 26, 2024

[Moore] [Canonicalizer] Lower struct-related assignOp #7341

Open
