-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SCC: Loop over blocks not annotated correctly when nested within another loop in driver #246
Comments
I can confirm the behaviour and you are on the right path w.r.t. where this comes from: currently, all loops that contain a call to a I'm not sure if we want/should prescribe some kind of rule about the nesting of driver loops. In any case, here is a test based on your reproduces, which should help when investigating this in the future: @pytest.mark.parametrize('frontend', available_frontends())
def test_scc_annotate_driver_loop(frontend):
"""
Test for issue #246
"""
fcode = """
module mod
implicit none
integer, parameter :: ngpblk = 2
integer, parameter :: klon = 10
integer, parameter :: klev = 5
contains
subroutine wrapper(arr)
integer, parameter :: niter = 10
integer, intent(in) :: arr(klon, klev, ngpblk)
integer :: jblk, jiter
! Begin relevant part.
DO jiter=1,niter
DO jblk=1,ngpblk
CALL kernel(klon, klev, 1, 1, arr(:, :, jblk))
END DO
END DO
! End relevant part.
end subroutine wrapper
subroutine kernel(klon, klev, jlev_lower, jlon_lower, arr)
integer, intent(in) :: klon
integer, intent(in) :: klev
integer, intent(in) :: jlev_lower, jlon_lower
integer, intent(inout) :: arr(klon, klev)
integer :: jlon, jlev
DO jlev=jlev_lower,klev
DO jlon=jlon_lower,klon
arr(jlon, jlev) = jlon + jlev
END DO
END DO
end subroutine kernel
end module mod
""".strip()
workdir = gettempdir() / 'test_scc_annotate_driver_loop'
workdir.mkdir(exist_ok=True)
source = Sourcefile.from_source(fcode, frontend=frontend, xmods=(workdir,))
horizontal = Dimension('horizontal', index='JLON', bounds=['jlon_lower', 'klon'], size='KLON')
vertical = Dimension('vertical', index='JLEV', bounds=['jlev_lower', 'klev'], size='KLEV')
block_dim = Dimension('block_dim', index='JBLK', size='NGPBLK')
# Test OpenACC annotations on non-hoisted version
scc_transform = (
SCCDevectorTransformation(horizontal=horizontal),
SCCDemoteTransformation(horizontal=horizontal),
SCCRevectorTransformation(horizontal=horizontal),
SCCAnnotateTransformation(
horizontal=horizontal, vertical=vertical, directive='openacc', block_dim=block_dim
),
)
for transform in scc_transform:
transform.apply(source['wrapper'], role='driver', targets=['kernel'])
transform.apply(source['kernel'], role='kernel')
# Ensure routine is annotated at vector level
pragmas = FindNodes(Pragma).visit(source['kernel'].ir)
assert len(pragmas) == 5
assert pragmas[0].keyword == 'acc'
assert pragmas[0].content == 'routine vector'
assert pragmas[1].keyword == 'acc'
assert pragmas[1].content == 'data present(arr)'
assert pragmas[-1].keyword == 'acc'
assert pragmas[-1].content == 'end data'
# Ensure vector and seq loops are annotated, including privatized variable `b`
with pragmas_attached(source['kernel'], Loop):
kernel_loops = FindNodes(Loop).visit(source['kernel'].ir)
assert len(kernel_loops) == 2
assert kernel_loops[0].pragma[0].keyword == 'acc'
assert kernel_loops[0].pragma[0].content == 'loop vector'
assert kernel_loops[1].pragma[0].keyword == 'acc'
assert kernel_loops[1].pragma[0].content == 'loop seq'
# Ensure a single outer parallel loop in driver
with pragmas_attached(source['wrapper'], Loop):
driver_loops = FindNodes(Loop).visit(source['wrapper'].body)
assert len(driver_loops) == 2
assert not driver_loops[0].pragma
assert driver_loops[1].pragma[0].keyword == 'acc'
assert driver_loops[1].pragma[0].content == 'parallel loop gang vector_length(KLON)'
rmtree(workdir) |
Hello, filing an issue by request of @reuterbal.
Here is a reproducer of the problem:
After transforming with SCC, the relevant part becomes:
Although I believe this should be
instead.
I invoke the SCC transform with:
loki-transform.py convert --mode scc --config config.cfg --path src --out-path scc
,where the contents of "config.cfg" are:
I believe the issue is coming from
SCCAnnotate.process_driver
. I did some digging and I found out that the method callsSCCBaseTransformation.find_driver_loops
, which returns the outer loop in the above example.Should this function return the inner loop instead? (as the loops get passed to
annotate_driver
)?The text was updated successfully, but these errors were encountered: