Skip to content

Conversation

@TopiLeppanen
Copy link

Hello, I attempted to resolve the #1729, as I have been compiling some simple OpenCL kernels similar to @mahmood82.

According to the @felixdaas's suggestion of using the 'polygeist way', I have changed the default type conversion of cir.ptr to memref<?xType>.

To make the intermediate legalization work, this also changes the lowering of AllocaOp and GlobalOp to generate memref<1xType> followed by a CastOp to memref<?xType>. CastOp is later eliminated in Load/Store-lowering. Load/Store-lowering's findBaseAndIndices was then adjusted to generate op with zero indices in case a non-indexed op is performed.

I added a testcase ptr-arg.cir, with example code similar to one in #1729

I also had to touch the loop lowering to SCF, since the alloca-change affected the 'canonical-loop recognition'.

I also had to change all the ThroughMLIR tests to work with the alloca of memref<1xType>.

Additionally, the ptr.dialect based lowering by @terapines-osc-cir is now basically dead code, as the ptrstride-ptr.cir-test they added is now lowered directly to memref-ops. So probably a new, more complex testcase would be needed to also test it, and to change the ptrstrideop-lowering to still take that path in the more complex cases.

This changes the default type conversion of cir.ptr to memref<?xType>

To make the intermediate legalization work, this also changes the
lowering of AllocaOp and GlobalOp to generate memref<1xType> followed by
a CastOp to memref<?xType>. CastOp is later eliminated in
Load/Store-lowering. Load/Store-lowering's findBaseAndIndices was then
adjusted to generate op with zero indices in case a non-indexed op is
performed.

Add a testcase ptr-arg.cir, with example code similar to one in llvm#1729

I also had to touch the loop lowering to SCF, since the alloca-change
affected the 'canonical-loop recognition'.

Fixes llvm#1729
@TopiLeppanen
Copy link
Author

CastOp is later eliminated in Load/Store-lowering.

Forgot to mention, but this was an optional step I decided to add in order to generate slightly less code, but we could just as well just leave the CastOp there, to simplify the implementation. I noticed that canonicalization is able to remove it as well. Leaving it in there would probably also reduce the changes I had to make in SCF-loop lowering as there's no longer this issue of referencing an op that has already been bypassed. As a tradeoff, all the tests that use alloca would then need to be fixed to have this memref.cast-op after the alloca, but that wouldn't be too difficult, just a bit noisy.

Let me know if that would make more sense, and I can do it pretty easily.

@bcardosolopes
Copy link
Member

@TopiLeppanen thanks for your first CIR contribution! I like the plan, the 'polygeist way' makes us closer to things we already know to work, way to go. Before we land this I just want to make sure that @terapines-osc-cir is happy with the path, can you give us a stamp?

@terapines-osc-cir
Copy link
Contributor

Before we land this I just want to make sure that @terapines-osc-cir is happy with the path

Cool! Like it 👍

Copy link
Member

@bcardosolopes bcardosolopes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM once conflicts are fixed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants