[test] Reduce for algorithm testing with debug builds to prevent timeouts #2058
Signed-off-by: Matthew Michel <[email protected]>
This reverts commit 75842f1.
Are we losing coverage on full kernels by shrinking this space or are we still at least hitting all the same kernels on debug?
On integrated graphics, we will be hitting all of the kernels in debug mode which make up most of our CI testing. However, on PVC / BMG we will not. I had experimented with a cap of 2 million, which would ensure this for PVC, but I saw very long times with debug builds on Battlemage (500 - 600 seconds per test).
Moving to draft while I test a new strategy on the following branch that addresses the point @danhoeflinger brought up: https://github.com/uxlfoundation/oneDPL/compare/dev/mmichel11/pfor_debug_input_size_reduce_tmp
@danhoeflinger This issue has been addressed.
The PR looks good enough to me leaving aside some minor comments.
LGTM
This seems like a good strategy with some flexibility going forward, allowing us to keep general coverage on debug builds. LGTM
Description
The input sizes used for testing for-based algorithms depend on the number of execution units (EUs) on the SYCL queue. This approach was used to ensure we generate inputs large enough to test the large for submitter path in oneDPL. A hard cap is set at 10,000,000 elements, which is sufficient to test the large submitter path on high-performance GPUs with many EUs.
However, we recently spotted that this approach results in test timeouts on client GPUs such as BMG and Arc in debug mode. To fix this issue, a new approach has been added for debug mode that avoids such timeouts while still running a single test case for the large submitter path. The "normal" set of inputs is used (factors of PI up to 100k elements), followed by a single test case above the threshold to invoke the large submitter. For release builds, testing remains unchanged and is more extensive. Test times on BMG with debug builds are now several hundred seconds (as opposed to timing out at 1200), and CI is able to complete several hours before the timeout threshold.
Details
The previous `get_pattern_for_max_n` utility, which returned a single maximum input size, is replaced with `get_pattern_for_test_sizes`, which returns a monotonically increasing sequence of input sizes to test with. Each for-based test is touched to accommodate these changes.
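For readers who want a concrete picture of the debug-mode size sequence described above, here is a minimal sketch. It is not the code from this PR: the function name `sketch_debug_test_sizes`, its parameters, the starting size of 1, and the truncating multiplication by 3.1415 are all assumptions made for illustration. Only the 100k limit for the "normal" inputs and the single extra case above the large submitter threshold come from the description.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical illustration only; this is NOT oneDPL's get_pattern_for_test_sizes.
// large_submitter_threshold is an assumed parameter: the input size above which
// the large submitter path would be selected on the device under test.
std::vector<std::size_t>
sketch_debug_test_sizes(std::size_t large_submitter_threshold, std::size_t normal_limit = 100000)
{
    std::vector<std::size_t> sizes;

    // "Normal" inputs: grow by a factor of roughly pi until the limit is passed.
    // Starting at 1 and truncating toward zero are assumptions of this sketch.
    for (std::size_t n = 1; n <= normal_limit; n = static_cast<std::size_t>(3.1415 * n))
        sizes.push_back(n);

    // One extra case just above the threshold so the large submitter runs once.
    // It is appended only if it extends the sequence, which keeps the result
    // strictly increasing even for a small threshold.
    const std::size_t large_n = large_submitter_threshold + 1;
    if (sizes.empty() || large_n > sizes.back())
        sizes.push_back(large_n);

    return sizes;
}
```

A test would then loop over the returned sizes in order instead of stepping up to a single maximum. With an arbitrary threshold such as 2,000,000, every size except the last stays at or below 100k, so only one test case pays the cost of a large input in a debug build.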