Textures being created with incorrect sizeInBytes #210
Comments
I'll try to find some time this week to look into this.
I spent some time looking into this and I think I know what is happening for this particular case. I think the underlying problem is how Bifrost computes array sizes. By default it uses the product of the first array stride length with the size of the first array dimension to get the overall array size. In Python that would be:
which looks ok. When
which doesn't match. The
In this case the correct array size would come from something like I'm less sure what to do about it. In this particular case changing the calculation of
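The stride-based size calculation described in the comment above can be sketched in plain Python. This is only an illustration, not Bifrost's code: the helper names, the 8-byte item size, and the strides are assumptions, and only the shapes (1, 1, 6) and (1, 1, 4) come from the issue description. It assumes row-major layout, non-negative strides, and non-empty arrays.

```python
ITEMSIZE = 8  # hypothetical item size in bytes, not taken from Bifrost


def stride_based_nbytes(shape, strides):
    """Size estimate that only looks at the first dimension."""
    return strides[0] * shape[0]


def span_nbytes(shape, strides, itemsize):
    """Bytes from the first element to the end of the last element."""
    return sum((n - 1) * s for n, s in zip(shape, strides)) + itemsize


# Contiguous array of shape (1, 1, 6): both calculations agree.
full_shape = (1, 1, 6)
full_strides = (48, 48, 8)  # assumed row-major byte strides
full_estimate = stride_based_nbytes(full_shape, full_strides)
full_span = span_nbytes(full_shape, full_strides, ITEMSIZE)

# A view trimmed to (1, 1, 4) keeps the parent's strides, so the
# first-dimension estimate still reports the parent's size.
trimmed_shape = (1, 1, 4)
trimmed_strides = full_strides
trimmed_estimate = stride_based_nbytes(trimmed_shape, trimmed_strides)
trimmed_span = span_nbytes(trimmed_shape, trimmed_strides, ITEMSIZE)

print(full_estimate, full_span)        # sizes for the contiguous array
print(trimmed_estimate, trimmed_span)  # sizes for the trimmed view
```

Under these assumptions the two numbers agree for the contiguous array and disagree for the trimmed view, because the estimate never looks at the last dimension.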
Thank you for looking at this @jaycedowell. I'll re-read your comment a few times and attempt to have another look at it next week.
Hi @jaycedowell, I think the following gives the correct value:
size_t A_nelement_total =
    (A_offset + N)                     // the elements in the first row of first batch
    + (K - 1) * A_stride               // the elements in the rest of the first batch
    + (nbatch - 1) * A_batchstride;    // the elements for the remaining batches
The tests pass and I no longer get memory violations in ROCm. Do you think the logic works now for all use cases?
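One way to sanity-check the proposed sum is to compare it with a brute-force search for the highest element index touched. The Python sketch below assumes element (b, k, n) lives at index `A_offset + b * A_batchstride + k * A_stride + n` with non-negative strides. That layout is inferred from the code comments in the proposal, not confirmed against linalg_kernels.cu, and the function names and sample values are hypothetical.

```python
from itertools import product


def nelement_total(A_offset, N, K, A_stride, nbatch, A_batchstride):
    """Closed-form element count, mirroring the proposed C++ expression."""
    return (A_offset + N) + (K - 1) * A_stride + (nbatch - 1) * A_batchstride


def nelement_total_brute_force(A_offset, N, K, A_stride, nbatch, A_batchstride):
    """Highest touched index plus one, under the assumed layout."""
    highest = max(
        A_offset + b * A_batchstride + k * A_stride + n
        for b, k, n in product(range(nbatch), range(K), range(N))
    )
    return highest + 1


# Hypothetical parameter sets: (A_offset, N, K, A_stride, nbatch, A_batchstride)
cases = [
    (1, 2, 1, 3, 1, 3),    # one row, one batch, offset of 1
    (0, 4, 3, 4, 2, 12),   # densely packed rows and batches
    (2, 3, 2, 5, 3, 16),   # padded rows and batches with an offset
]
results = [
    (nelement_total(*c), nelement_total_brute_force(*c)) for c in cases
]
print(results)
```

The two functions agree whenever the strides are non-negative, which is the only case this sketch covers.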
I'm not sure that
I think it needs to be sized to store all the batches. The texture isn't created once per batch, it's shared amongst all of them, from my reading of the code. In any case, if I delete the last term in the sum the tests will fail again.
The test_linalg/test_matmul_ab_beamformer_kernel_small test adds a misalign parameter to test how linalg_kernels.cu handles memory not aligned to 512-byte boundaries:
bifrost/test/test_linalg.py
Lines 224 to 231 in 30fd268
For example, on the third innermost iteration, this test passes in the values (nchan = 1, ntime = 1, nstand = 3, misalign = 2). This produces an array with size (1, 1, 6), which is then truncated (using the misalign value) down to (1, 1, 4).
After calling linalg.matmul, this eventually reaches the function bf_cherk_N(), where the total number of elements in the array needs to be calculated:
bifrost/src/linalg_kernels.cu
Lines 539 to 540 in 30fd268
For this particular test, n_element_total should be 2 (size of truncated array) + 1 (offset); however, it is calculated as 3 + 1 = 4.
When this incorrect, larger size is later used to calculate the sizeInBytes of the texture, the texture believes it has access to more memory than it really does. On CUDA backends this isn't actually validated, but on ROCm backends the texture creation fails, since it actually checks the backing memory.
I've spent some time with this but haven't been able to fix it, as I'm not clear what all of N, K, A_stride, nbatch, etc. are, or where they are meant to be coming from.
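To make the mismatch concrete, here is a minimal Python sketch of the arithmetic. Only the element counts 2 + 1 and 3 + 1 come from the description above. The element size and helper names are hypothetical, and the check is just a stand-in for whatever validation the ROCm backend actually performs.

```python
ELEMENT_BYTES = 8  # hypothetical element size, not taken from linalg_kernels.cu


def size_in_bytes(n_elements, element_bytes=ELEMENT_BYTES):
    """Byte size that a texture covering n_elements would claim."""
    return n_elements * element_bytes


def texture_fits(claimed_elements, backing_elements):
    """True if the claimed texture size stays inside the backing memory."""
    return size_in_bytes(claimed_elements) <= size_in_bytes(backing_elements)


expected_elements = 2 + 1  # truncated array size + offset
computed_elements = 3 + 1  # what the current calculation produces

overrun_bytes = size_in_bytes(computed_elements) - size_in_bytes(expected_elements)
print(texture_fits(expected_elements, expected_elements))
print(texture_fits(computed_elements, expected_elements))
print(overrun_bytes)
```

With these assumed sizes the computed count claims one element more than the backing memory holds, which is the kind of overrun a backend that validates the range would reject.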