[SYCL][E2E][USM] Add P2P USM residency test for L0 v2 adapter #21944
Closed
ldorau wants to merge 9 commits into
07849b5 to 9bdacf8
- Skip peers with disabled P2P in makeProvider (USM pool creation)
- Add urUsmP2PEnablePeerAccessExp / urUsmP2PDisablePeerAccessExp
- Track per-device peer status in ur_device_handle_t_::peers[]
- Update existing USM pool residency on P2P enable/disable

Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com>

- Fill in three placeholder multi-device tests in memory_residency.cpp
- Tests verify P2P-driven residency: absent-on-peer without P2P, enable/disable state machine checks, end-to-end data transfer

Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com>
9bdacf8 to 16546c8
lslusarczyk reviewed May 11, 2026
Extract common logic from ext_oneapi_enable_peer_access and ext_oneapi_disable_peer_access into a templated p2pAccessHelper function to avoid code duplication.

Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com>
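The refactoring described in this commit could look roughly like the sketch below. It is only an illustration of the pattern (one template that takes the adapter call as a parameter), not the code from the PR: `Device`, `Result`, `enablePeer`, `disablePeer`, the same-device check, and the helper's signature are all hypothetical stand-ins.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-ins; these are NOT the real SYCL runtime or UR types.
struct Device { int id; };
enum class Result { Success, InvalidOperation };

// Stand-ins for the two adapter entry points. In this toy model the disable
// call always fails so that the error path can be exercised.
inline Result enablePeer(Device &, Device &) { return Result::Success; }
inline Result disablePeer(Device &, Device &) { return Result::InvalidOperation; }

// Shared logic lives here once; the only thing that differs between
// "enable" and "disable" is the callable passed in.
template <typename AdapterCall>
void p2pAccessHelper(Device &self, Device &peer, AdapterCall call) {
  assert(self.id >= 0 && peer.id >= 0); // ids are non-negative in this model
  // Assumed shared precondition, for illustration only.
  if (self.id == peer.id)
    throw std::invalid_argument("peer must be a different device");
  // Shared error translation: adapter result code -> exception.
  if (call(self, peer) != Result::Success)
    throw std::runtime_error("P2P access change failed");
}
```

With a helper of this shape, each public entry point reduces to a one-line call such as `p2pAccessHelper(self, peer, enablePeer)`.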
The disablePeerAccessStateMachineAndSourceAllocationPersists test was failing intermittently because deferred frees from the preceding test complete asynchronously, causing UR_DEVICE_INFO_GLOBAL_MEM_FREE to report more free memory than the baseline captured at the start of the test.

Remove the unreliable source-device free-memory assertion and the allocation it required, keeping only the state-machine checks (disable succeeds, double-disable returns UR_RESULT_ERROR_INVALID_OPERATION). The source-device allocation property is already covered by allocatingDeviceMemoryWillResultInOOM, which runs first in isolation.
16546c8 to 51cabf7
51cabf7 to 6f5deb6
…oreEnablingPeerAccess

Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com>
Add a P2P peer-status check in command_list_manager::appendUSMMemcpy that queries the source pointer's owning device via zeMemGetAllocProperties. If the source is device memory on a different device, the adapter verifies that the source device's peer table grants access to the queue's device (srcDevice->peers[queueDevice->Id] == ENABLED). Returns UR_RESULT_ERROR_INVALID_OPERATION if P2P access has not been enabled.

Previously, zeCommandListAppendMemoryCopy would silently succeed for cross-device copies via the copy engine regardless of P2P state, making it impossible to test that ext_oneapi_disable_peer_access actually revokes access.

Also adds negative-pair tests that verify urEnqueueUSMMemcpy fails when P2P is disabled:
- enablePeerAccessStateMachineAndSourceAllocationFailsWithoutP2P
- p2pReadFailsWithoutPeerAccessDisabled

Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com>
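The decision logic of that check can be sketched in plain C++ as below. This is a simplified model under stated assumptions, not the adapter code: `Device`, `PeerStatus`, `Result`, and `checkMemcpyAllowed` are hypothetical names, and the owning device is passed in directly instead of being looked up with zeMemGetAllocProperties.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for the adapter's types.
enum class PeerStatus { Disabled, Enabled };
enum class Result { Success, InvalidOperation };

struct Device {
  std::size_t Id;
  // peers[i] answers: may device i access THIS device's allocations?
  std::vector<PeerStatus> peers;
};

// srcOwner is the device that owns the source allocation, or nullptr when the
// source is not device memory (assumed here to need no P2P check).
Result checkMemcpyAllowed(const Device &queueDevice, const Device *srcOwner) {
  // Same-device or non-device sources are not cross-device copies.
  if (srcOwner == nullptr || srcOwner->Id == queueDevice.Id)
    return Result::Success;
  assert(queueDevice.Id < srcOwner->peers.size());
  // Cross-device copy: the source device's table must grant access
  // to the device the queue runs on.
  return srcOwner->peers[queueDevice.Id] == PeerStatus::Enabled
             ? Result::Success
             : Result::InvalidOperation;
}
```

The lookup is indexed by the queue's device but performed on the source device's table, which matches the `srcDevice->peers[queueDevice->Id]` expression in the commit message.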
The peer table lives on peerDevice: peerDevice->peers[commandDevice->Id] tracks whether commandDevice is allowed to access peerDevice's allocations.

Update urUsmP2PChangePeerAccessExp to lock peerDevice's mutex, read/write peerDevice's peer table, use peerDevice's platform for context iteration, and pass (peerDevice, commandDevice) to changeResidentDevice and validateP2PDevicePair.

Also fix urUsmP2PPeerAccessGetInfoExp to query the peer table on peerDevice rather than commandDevice.

Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com>
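The direction of the peer table is easy to get backwards, so here is a minimal model of it. Everything in the sketch is an assumption made for illustration: the type names, the `MaxDevices` bound, and `changePeerAccess` are hypothetical, locking is omitted, and rejecting a double-enable is assumed by symmetry with the double-disable behavior described earlier in this PR.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for the adapter's types.
enum class PeerStatus { Disabled, Enabled };
enum class Result { Success, InvalidOperation };

constexpr std::size_t MaxDevices = 4; // assumed bound for this model

struct Device {
  std::size_t Id;
  // Value-initialized to Disabled for every possible peer.
  std::array<PeerStatus, MaxDevices> peers{};
};

// Grants or revokes commandDevice's access to peerDevice's allocations.
// The state is stored on peerDevice, indexed by commandDevice's Id.
Result changePeerAccess(Device &commandDevice, Device &peerDevice,
                        PeerStatus wanted) {
  assert(commandDevice.Id < MaxDevices);
  PeerStatus &slot = peerDevice.peers[commandDevice.Id];
  if (slot == wanted)
    return Result::InvalidOperation; // already in the requested state
  slot = wanted;
  return Result::Success;
}
```

Enabling access from device 1 to device 0 therefore changes only `d0.peers[1]`; `d1.peers[0]` stays disabled until the reverse direction is enabled separately.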
Adds sycl/test-e2e/USM/P2P/p2p_usm_residency.cpp to verify that the Level Zero v2 adapter restricts USM device memory residency to only those peer devices for which P2P access has been explicitly enabled via ext_oneapi_enable_peer_access.

Phase 1 (P2P disabled): allocates 1 MB on dev0 and checks that dev1 free memory does not decrease, proving the allocation is not made resident on dev1.

Phase 2 (P2P enabled): allocates 1 MB on dev0 and checks that dev1 free memory decreases by at least the allocation size, proving the allocation is resident on dev1.

Also adds the 'two-or-more-gpu-devices' lit feature to lit.cfg.py, set when sycl-ls reports at least two GPU devices. The test uses this feature to skip on single-GPU machines.

Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com>
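The pass criteria of Phase 1 and Phase 2 come down to comparing two free-memory readings taken on dev1, before and after the allocation on dev0. A sketch of that comparison follows. `classifyPeerResidency` and the `Residency` enum are hypothetical names, and the `Inconclusive` outcome is an addition of this sketch for the case the commit message does not cover (free memory drops by less than the allocation size).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical classification of what a pair of readings suggests.
enum class Residency { NotResident, Resident, Inconclusive };

// freeBefore / freeAfter: free memory reported by the peer device (dev1)
// before and after allocating allocSize bytes on the other device (dev0).
Residency classifyPeerResidency(std::uint64_t freeBefore,
                                std::uint64_t freeAfter,
                                std::uint64_t allocSize) {
  assert(allocSize > 0);
  // Phase 1 expectation: free memory on the peer did not decrease.
  if (freeAfter >= freeBefore)
    return Residency::NotResident;
  // Phase 2 expectation: free memory dropped by at least the allocation size.
  // The subtraction cannot underflow because freeAfter < freeBefore here.
  if (freeBefore - freeAfter >= allocSize)
    return Residency::Resident;
  return Residency::Inconclusive;
}
```

Free-memory readings can be disturbed by unrelated activity on the device, as the flaky-test fix earlier in this PR shows, so a comparison of this kind is only meaningful when nothing else allocates or frees on dev1 between the two readings.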
…isable

Add Phase 3 to p2p_usm_residency.cpp that enables then disables P2P access from dev1 to dev0, then attempts a memcpy via dev1's queue. The test passes if the memcpy throws an exception or if the copied data does not match the original fill pattern, confirming that ext_oneapi_disable_peer_access actually revokes access.
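The Phase 3 pass criterion (the copy either throws or yields data that differs from the fill pattern) can be expressed as a small predicate. The sketch below models the copy as an arbitrary callable on host vectors; `accessRevoked` is a hypothetical name and no SYCL queue is involved.

```cpp
#include <cassert>
#include <exception>
#include <stdexcept>
#include <vector>

// Returns true when the attempted copy shows that access was revoked:
// either the copy throws, or the destination does not match the pattern.
// The pattern must be non-zero, because the destination starts zero-filled
// and an all-zero pattern would make a skipped copy look successful.
template <typename CopyFn>
bool accessRevoked(const std::vector<int> &expected, CopyFn copy) {
  assert(!expected.empty());
  std::vector<int> dst(expected.size(), 0);
  try {
    copy(dst); // stands in for the memcpy submitted to dev1's queue
  } catch (const std::exception &) {
    return true; // the copy was rejected outright
  }
  return dst != expected; // the copy ran but did not deliver the data
}
```

Accepting either outcome keeps the check independent of whether the adapter rejects the copy up front or the copy simply fails to deliver the data.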
Incorporated into #21889
It adds a SYCL e2e test to #21889