Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enhance test and document for DPI and tweak trait name. #4380

Draft
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

uenoku
Copy link
Contributor

@uenoku uenoku commented Sep 4, 2024

This PR adds a test for open array and document docs/src/explanations/dpi.md which describes DPI API usage.
This also changes DPIClockedVoidFunctionImport to DPIVoidFunctionImport for more consistent naming (this is breaking change but I believe there is no user of DPIClockedVoidFunctionImport so it should be ok).

Contributor Checklist

  • Did you add Scaladoc to every public function/method?
  • Did you add at least one test demonstrating the PR?
  • Did you delete any extraneous printlns/debugging code?
  • Did you specify the type of improvement?
  • Did you add appropriate documentation in docs/src?
  • Did you request a desired merge strategy?
  • Did you add text to be included in the Release Notes for this change?

Type of Improvement

  • API modification
  • Documentation or website-related

Desired Merge Strategy

  • Squash: The PR will be squashed and merged (choose this if you have no preference).

Release Notes

Enhance test and document for DPI and tweak trait name

Reviewer Checklist (only modified by reviewer)

  • Did you add the appropriate labels? (Select the most appropriate one based on the "Type of Improvement")
  • Did you mark the proper milestone (Bug fix: 3.6.x, 5.x, or 6.x depending on impact, API modification or big change: 7.0)?
  • Did you review?
  • Did you check whether all relevant Contributor checkboxes have been checked?
  • Did you do one of the following when ready to merge:
    • Squash: You/ the contributor Enable auto-merge (squash), clean up the commit message, and label with Please Merge.
    • Merge: Ensure that contributor has cleaned up their commit history, then merge with Create a merge commit.

@uenoku uenoku added the Documentation Only changing documentation label Sep 4, 2024
final def apply() = super.call()
}

Hello() // Print
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this Hello() call be inside a Module?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. The snippet was meant to be pseudo code so I'll update with mdoc.


To call this function from Chisel, we need to define a corresponding DPI object.

```scala
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you use mdoc so that we can check the code compiles at least?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, thank you for pointing out. Totally missed there is mdoc support in chisel repo as well.

Comment on lines 47 to 48
* A vector is lowered to an *unpacked* *open* array type, e.g., `a: Vec<4, UInt>` to `byte a []`.
* A bundle is lowered to a packed struct.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* A vector is lowered to an *unpacked* *open* array type, e.g., `a: Vec<4, UInt>` to `byte a []`.
* A bundle is lowered to a packed struct.
* A `Vec` is lowered to an *unpacked* *open* array type, e.g., `a: Vec<4, UInt>` to `byte a []`.
* A `Bundle` is lowered to a packed struct.

for vec i think you mean a chisel3.Vec not a scala Vector, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, chisel3.Vec.

* Operand and result types must be passive.
* A vector is lowered to an *unpacked* *open* array type, e.g., `a: Vec<4, UInt>` to `byte a []`.
* A bundle is lowered to a packed struct.
* Integer types are lowered into 2-state types.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* Integer types are lowered into 2-state types.
* `Int`, `SInt`, `Clock`, `Reset` types are lowered into 2-state types.

? Are clock and reset lowered?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not actually sure if you are talkign about Chisel Data subclasses or not

final def apply(lhs: UInt, rhs: UInt): UInt = super.call(lhs, rhs)
}

val result = Add(2.U(32.W), 3.U(32.W))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

again would be helpful to see this with mdoc, I assume this Add function belongs inside a Module or a RawModule given that we say clocked is False


* `Add` inherits from `DPINonVoidFunctionImport[UInt]` because it returns a 32-bit unsigned integer.
* `ret` specifies the return type.
* `clocked` indicates that this is a clocked function call.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but this is set to false so isn't NOT a clocked function call?

object Sum extends DPINonVoidFunctionImport[UInt] {
override val functionName = "sum"
override val ret = UInt(32.W)
override val clocked = false
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should clocked be true?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For OpenArray example both clocked and unclocked work. So I didn't explicitly describe the example. It depends on whether user wants to evaluate the value at every input value changes or clock's posedge.
This is an example for Add

dpi.io.add_clocked_result.peek()
dpi.io.add_clocked_result.expect(60)
dpi.io.add_unclocked_result.peek()
dpi.io.add_unclocked_result.expect(36)

}
```

```scala
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

again please use mdoc to check for compile if not chisel elaboration

```

# FAQ

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you show an example where having clocked = false makes sense? Is that done in a when context...?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unclocked calls be useful when user wants to replace pure (but expensive) combinatorial logics with dpi call (actually Add is a good example of such use case).
An unclocked call under when(cond) is lowered into always_comb + if(cond) and DPI is conditionally invoked as well. I updated Add example to include both of them.

class AddTest extends Module {
  val io = IO(new Bundle {
    val a = Input(UInt(32.W))
    val b = Input(UInt(32.W))
    val c = Output(UInt(32.W))
    val d = Output(UInt(32.W))
    val en = Input(Bool())
  })

  // Call DPI only when `en` is true.
  when (io.en) {
    io.c := AddClocked(io.a, io.b)
    io.d := AddUnclocked(io.a, io.b)
  } .otherwise {
    io.c := 0.U(32.W)
    io.d := 0.U(32.W)
  }
}
module AddTest(
  input         clock,
                reset,
  input  [31:0] io_a,
                io_b,
  output [31:0] io_c,
                io_d,
  input         io_en
);

  logic [31:0] _add_0;
  reg   [31:0] _GEN;
  always @(posedge clock) begin
    if (io_en) begin
      add(io_a, io_b, _add_0);
      _GEN <= _add_0;
    end
  end // always @(posedge)
  reg   [31:0] _GEN_0;
  always_comb begin
    if (io_en) begin
      add(io_a, io_b, _GEN_0);
    end
    else
      _GEN_0 = 32'bx;
  end // always_comb
  assign io_c = io_en ? _GEN : 32'h0;
  assign io_d = io_en ? _GEN_0 : 32'h0;

@mmaloney-sf
Copy link
Contributor

I'm not 100% sure of Chisel's documentation conventions, but quickly glancing at other .md files, I think we should stick to a "one sentence per line" convention.

This makes diffs easier to read.

docs/src/explanations/dpi.md Outdated Show resolved Hide resolved
docs/src/explanations/dpi.md Outdated Show resolved Hide resolved
docs/src/explanations/dpi.md Outdated Show resolved Hide resolved
docs/src/explanations/dpi.md Outdated Show resolved Hide resolved

## Type ABI

Unlike normal Chisel compilation flow, we use a specific ABI for types to interact with DPI.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what this means. Could you elaborate?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It may also be useful to explicitly point out that what we're doing in this section is showing how Chisel types map to C types.

Here's an example of a DPI function that calculates the sum of two numbers:

```c++
extern "C" void add(int lhs, int rhs, int* result)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It surprised me to see here that the int* result is an out-parameter. Is that always the case? (If so, maybe it needs to be called out earlier. Or maybe I just missed it?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I thought I clarified that but apparently it was dropped somewhere. I'll add explanation for that.

Chisel vectors are converted into SystemVerilog open arrays when used with DPI. Since memory layout can vary between simulators, it's recommended to use `svSize` and `svGetBitArrElemVecVal` to access array elements.

```c++
extern "C" void sum(const svOpenArrayHandle array, int* result) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is svOpenArrayHandle?

I guess, more generally, what libraries does Chisel depend on in the C++ DPI? We can't simply assume that a general Chisel user will have whatever library exposes DPI.h or whatever it is.

It would be good to have a section at the very start that gives the C++ prerequisites.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the suggestion, that makese sense. SV Spec (Section 35) defineds svdpi.h which declares these functions and types (including svOpenArrayHandle). These are expected to be implemented by simulators. I'll create a section for that.


# FAQ

* Can we export functions? -- No, not currently. Consider using a black box for such functionality.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean by "export" here?


* Can we export functions? -- No, not currently. Consider using a black box for such functionality.
* Can we call a DPI function in initial block? -- No, not currently. Consider using a black box for initialization.
* Can we call two clocked DPI calls and pass the result to another within the same clock? -- No, not currently. Please merge the DPI functions into a single function.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what this last point means.

It feels like what we actually want is a light explanation of the underlying "framework" for the DPI calls, and that the above comment is actually a restriction imposed on this framework.

For example, is this true?: "All DPI calls on the same clock are dispatched at the same time. Because of this, it is not possible to use the result of one DPI call as an argument to another."

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All DPI calls on the same clock are dispatched at the same time. Because of this, it is not possible to use the result of one DPI call as an argument to another

Yes, that's correct. This is fundamental restrictions for side-effecting operations in Chisel but DPI is a first example that has both side-effect and results.

@@ -17,6 +17,7 @@ private object EmitDPIImplementation {
val dpiImpl = s"""
|#include <stdint.h>
|#include <iostream>
|#include <svdpi.h>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To reiterate an earlier point: Let's add a prerequisites section near the top. Let's make sure to name what library(ies?) are needed for Chisel DPI, and let's be sure to include the svdpi.h header in our examples, since otherwise, the code doesn't work.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's also link to the documentation so that Chisel users can quickly cross-reference the DPI primitives, such as svBitVecVal or svOpenArrayHandle.

Copy link
Contributor Author

@uenoku uenoku left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for comments! I added code examples with mdoc. I'll mark it draft until all comments were addressed.

Comment on lines 47 to 48
* A vector is lowered to an *unpacked* *open* array type, e.g., `a: Vec<4, UInt>` to `byte a []`.
* A bundle is lowered to a packed struct.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, chisel3.Vec.

Here's an example of a DPI function that calculates the sum of two numbers:

```c++
extern "C" void add(int lhs, int rhs, int* result)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I thought I clarified that but apparently it was dropped somewhere. I'll add explanation for that.

Chisel vectors are converted into SystemVerilog open arrays when used with DPI. Since memory layout can vary between simulators, it's recommended to use `svSize` and `svGetBitArrElemVecVal` to access array elements.

```c++
extern "C" void sum(const svOpenArrayHandle array, int* result) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the suggestion, that makese sense. SV Spec (Section 35) defineds svdpi.h which declares these functions and types (including svOpenArrayHandle). These are expected to be implemented by simulators. I'll create a section for that.

object Sum extends DPINonVoidFunctionImport[UInt] {
override val functionName = "sum"
override val ret = UInt(32.W)
override val clocked = false
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For OpenArray example both clocked and unclocked work. So I didn't explicitly describe the example. It depends on whether user wants to evaluate the value at every input value changes or clock's posedge.
This is an example for Add

dpi.io.add_clocked_result.peek()
dpi.io.add_clocked_result.expect(60)
dpi.io.add_unclocked_result.peek()
dpi.io.add_unclocked_result.expect(36)


To call this function from Chisel, we need to define a corresponding DPI object.

```scala
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, thank you for pointing out. Totally missed there is mdoc support in chisel repo as well.

final def apply() = super.call()
}

Hello() // Print
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. The snippet was meant to be pseudo code so I'll update with mdoc.


* Can we export functions? -- No, not currently. Consider using a black box for such functionality.
* Can we call a DPI function in initial block? -- No, not currently. Consider using a black box for initialization.
* Can we call two clocked DPI calls and pass the result to another within the same clock? -- No, not currently. Please merge the DPI functions into a single function.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All DPI calls on the same clock are dispatched at the same time. Because of this, it is not possible to use the result of one DPI call as an argument to another

Yes, that's correct. This is fundamental restrictions for side-effecting operations in Chisel but DPI is a first example that has both side-effect and results.

```

# FAQ

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unclocked calls be useful when user wants to replace pure (but expensive) combinatorial logics with dpi call (actually Add is a good example of such use case).
An unclocked call under when(cond) is lowered into always_comb + if(cond) and DPI is conditionally invoked as well. I updated Add example to include both of them.

class AddTest extends Module {
  val io = IO(new Bundle {
    val a = Input(UInt(32.W))
    val b = Input(UInt(32.W))
    val c = Output(UInt(32.W))
    val d = Output(UInt(32.W))
    val en = Input(Bool())
  })

  // Call DPI only when `en` is true.
  when (io.en) {
    io.c := AddClocked(io.a, io.b)
    io.d := AddUnclocked(io.a, io.b)
  } .otherwise {
    io.c := 0.U(32.W)
    io.d := 0.U(32.W)
  }
}
module AddTest(
  input         clock,
                reset,
  input  [31:0] io_a,
                io_b,
  output [31:0] io_c,
                io_d,
  input         io_en
);

  logic [31:0] _add_0;
  reg   [31:0] _GEN;
  always @(posedge clock) begin
    if (io_en) begin
      add(io_a, io_b, _add_0);
      _GEN <= _add_0;
    end
  end // always @(posedge)
  reg   [31:0] _GEN_0;
  always_comb begin
    if (io_en) begin
      add(io_a, io_b, _GEN_0);
    end
    else
      _GEN_0 = 32'bx;
  end // always_comb
  assign io_c = io_en ? _GEN : 32'h0;
  assign io_d = io_en ? _GEN_0 : 32'h0;

@uenoku uenoku marked this pull request as draft September 5, 2024 13:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Documentation Only changing documentation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants