[test] Reduce for algorithm testing with debug builds to prevent timeouts #2058

mmichel11 · 2025-02-10T21:40:24Z

Description

The input sizes used for testing for-based algorithms is dependent on the number of execution units on the SYCL queue. This approach was used to ensure we generate large enough inputs to test the large for submitter path in oneDPL. A hard cap is set at 10,000,000 elements which is sufficient to test the large submitter path on high-performance GPUs with many EUs.

However, we recently spotted that this approach results in test timeouts on client GPUs such as BMG and Arc in debug mode. To fix this issue, a new approach has been added for debug mode to avoid such timeouts while ensuring a single test case for the large submitter path. The "normal" set of inputs is used (factors of PI up to 100k elements) followed by a single test case above the threshold to invoke the large submitter. For release builds, testing remains unchanged and is more extensive. Test times on BMG with debug builds now in several hundred seconds (as opposed to timing out at 1200) and CI is able to complete with several hours before the timeout threshold.

Details

The previous utility get_pattern_for_max_n utility which returned a single maximum input is replaced with get_pattern_for_test_sizes which returns a monotonically increasing sequence of elements to test with. Each for-based test is touched to accommodate these changes.

Signed-off-by: Matthew Michel <[email protected]>

This reverts commit 75842f1.

danhoeflinger · 2025-02-10T21:59:06Z

Are we losing coverage on full kernels by shrinking this space or are we still at least hitting all the same kernels on debug?

mmichel11 · 2025-02-10T22:09:18Z

Are we losing coverage on full kernels by shrinking this space or are we still at least hitting all the same kernels on debug?

On integrated graphics, we will be hitting all of the kernels in debug mode which make up most of our CI testing. However, on PVC / BMG we will not. I had experimented with a cap of 2 million which would ensure this for PVC, but I saw very long times with debug builds on battlemage (500 - 600 seconds per test).

mmichel11 · 2025-02-11T02:30:21Z

Are we losing coverage on full kernels by shrinking this space or are we still at least hitting all the same kernels on debug?

On integrated graphics, we will be hitting all of the kernels in debug mode which make up most of our CI testing. However, on PVC / BMG we will not. I had experimented with a cap of 2 million which would ensure this for PVC, but I saw very long times with debug builds on battlemage (500 - 600 seconds per test).

Moving to draft while I test a new strategy on the following branch that addresses the point @danhoeflinger brought up: https://github.com/uxlfoundation/oneDPL/compare/dev/mmichel11/pfor_debug_input_size_reduce_tmp

Signed-off-by: Matthew Michel <[email protected]>

mmichel11 · 2025-02-11T14:56:05Z

Are we losing coverage on full kernels by shrinking this space or are we still at least hitting all the same kernels on debug?

@danhoeflinger This issue has been addressed. get_pattern_for_max_n is replaced with get_pattern_for_test_sizes which returns a sequence of elements to test with. In the case of debug mode, we inject an input past the estimated upper-bound threshold for the large submitter. Release build test inputs are unchanged. This fixes the BMG / Arc timeout issue without compromising testing in the debug case.

dmitriy-sobolev

The PR looks good enough to me leaving aside some minor comments.

test/support/utils.h

Signed-off-by: Matthew Michel <[email protected]>

test/support/utils.h

Signed-off-by: Matthew Michel <[email protected]>

dmitriy-sobolev

LGTM

danhoeflinger

This seems like a good strategy with some flexibility going forward, allowing us to keep general coverage on debug builds. LGTM

mmichel11 added 3 commits February 10, 2025 13:46

Reduce for algorithm testing with debug builds to prevent timeouts

64dc966

Signed-off-by: Matthew Michel <[email protected]>

Experiment raising debug cap to 2M elements

75842f1

Signed-off-by: Matthew Michel <[email protected]>

Revert "Experiment raising debug cap to 2M elements"

1fe53ec

This reverts commit 75842f1.

mmichel11 requested review from dmitriy-sobolev, danhoeflinger and adamfidel February 10, 2025 21:40

mmichel11 marked this pull request as draft February 11, 2025 02:31

mmichel11 added 2 commits February 11, 2025 08:40

Add new strategy to avoid timeouts

c543935

Signed-off-by: Matthew Michel <[email protected]>

Bugfix in adjacent_difference.pass

732e3a2

Signed-off-by: Matthew Michel <[email protected]>

mmichel11 marked this pull request as ready for review February 11, 2025 14:56

dmitriy-sobolev reviewed Feb 11, 2025

View reviewed changes

test/support/utils.h Show resolved Hide resolved

test/support/utils.h Outdated Show resolved Hide resolved

Rename large_submitter_ub to large_submitter_limit

fb88b09

Signed-off-by: Matthew Michel <[email protected]>

dmitriy-sobolev reviewed Feb 11, 2025

View reviewed changes

test/support/utils.h Outdated Show resolved Hide resolved

Address review comments

10ff102

Signed-off-by: Matthew Michel <[email protected]>

dmitriy-sobolev approved these changes Feb 11, 2025

View reviewed changes

danhoeflinger approved these changes Feb 11, 2025

View reviewed changes

mmichel11 merged commit 4f216b4 into main Feb 11, 2025
22 checks passed

mmichel11 deleted the dev/mmichel11/pfor_debug_input_size_reduce branch February 11, 2025 20:01

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[test] Reduce for algorithm testing with debug builds to prevent timeouts #2058

[test] Reduce for algorithm testing with debug builds to prevent timeouts #2058

mmichel11 commented Feb 10, 2025 •

edited

Loading

danhoeflinger commented Feb 10, 2025

mmichel11 commented Feb 10, 2025 •

edited

Loading

mmichel11 commented Feb 11, 2025 •

edited

Loading

mmichel11 commented Feb 11, 2025

dmitriy-sobolev left a comment

dmitriy-sobolev left a comment

danhoeflinger left a comment

[test] Reduce for algorithm testing with debug builds to prevent timeouts #2058

[test] Reduce for algorithm testing with debug builds to prevent timeouts #2058

Conversation

mmichel11 commented Feb 10, 2025 • edited Loading

Description

Details

danhoeflinger commented Feb 10, 2025

mmichel11 commented Feb 10, 2025 • edited Loading

mmichel11 commented Feb 11, 2025 • edited Loading

mmichel11 commented Feb 11, 2025

dmitriy-sobolev left a comment

Choose a reason for hiding this comment

dmitriy-sobolev left a comment

Choose a reason for hiding this comment

danhoeflinger left a comment

Choose a reason for hiding this comment

mmichel11 commented Feb 10, 2025 •

edited

Loading

mmichel11 commented Feb 10, 2025 •

edited

Loading

mmichel11 commented Feb 11, 2025 •

edited

Loading