[SYCL][E2E][USM] Add P2P USM residency test for L0 v2 adapter#21944

Closed
ldorau wants to merge 9 commits into intel:sycl from ldorau:SYCLE2EUSM_Add_P2P_USM_residency_test_for_L0_v2_adapter

@ldorau ldorau commented May 6, 2026

This PR adds a SYCL e2e test for #21889.

@ldorau ldorau force-pushed the SYCLE2EUSM_Add_P2P_USM_residency_test_for_L0_v2_adapter branch 2 times, most recently from 07849b5 to 9bdacf8 Compare May 7, 2026 08:11
ldorau added 2 commits May 7, 2026 09:09
- Skip peers with disabled P2P in makeProvider (USM pool creation)
- Add urUsmP2PEnablePeerAccessExp / urUsmP2PDisablePeerAccessExp
- Track per-device peer status in ur_device_handle_t_::peers[]
- Update existing USM pool residency on P2P enable/disable

Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com>
- Fill in three placeholder multi-device tests in memory_residency.cpp
- Tests verify P2P-driven residency: absent-on-peer without P2P,
  enable/disable state machine checks, end-to-end data transfer

Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com>
@ldorau ldorau force-pushed the SYCLE2EUSM_Add_P2P_USM_residency_test_for_L0_v2_adapter branch from 9bdacf8 to 16546c8 Compare May 8, 2026 09:30
@ldorau ldorau changed the title [DRAFT] [SYCL][E2E][USM] Add P2P USM residency test for L0 v2 adapter [SYCL][E2E][USM] Add P2P USM residency test for L0 v2 adapter May 11, 2026
Comment thread sycl/test-e2e/USM/P2P/p2p_usm_residency.cpp
ldorau added 2 commits May 11, 2026 13:07
Extract common logic from ext_oneapi_enable_peer_access and
ext_oneapi_disable_peer_access into a templated p2pAccessHelper
function to avoid code duplication.

Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com>
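
The deduplication described in this commit can be pictured with the minimal sketch below. It is not the actual patch: the real `p2pAccessHelper` signature does not appear in this conversation, so `FakeDevice`, `P2PResult`, `p2pAccessHelperSketch`, and the "a device cannot be its own peer" check are all hypothetical stand-ins. The idea is only that the shared validation lives in one function template and the enable- or disable-specific step is passed in as a callable.

```cpp
#include <cassert>
#include <utility>

// Hypothetical stand-ins; the real SYCL runtime types are not shown in this PR.
enum class P2PResult { Success, InvalidDevicePair };

struct FakeDevice {
  int id;
};

// Shared logic: validate the device pair once, then run the
// operation-specific backend step supplied by the caller.
template <typename BackendCall>
P2PResult p2pAccessHelperSketch(const FakeDevice &self, const FakeDevice &peer,
                                BackendCall &&call) {
  // Assumption: a device cannot be its own peer.
  if (self.id == peer.id)
    return P2PResult::InvalidDevicePair;
  std::forward<BackendCall>(call)(self, peer);
  return P2PResult::Success;
}

// The two entry points shrink to thin wrappers around the helper.
// enabledLinks is a hypothetical counter standing in for the backend call.
inline P2PResult enablePeerAccessSketch(const FakeDevice &self,
                                        const FakeDevice &peer,
                                        int &enabledLinks) {
  return p2pAccessHelperSketch(
      self, peer,
      [&enabledLinks](const FakeDevice &, const FakeDevice &) { ++enabledLinks; });
}

inline P2PResult disablePeerAccessSketch(const FakeDevice &self,
                                         const FakeDevice &peer,
                                         int &enabledLinks) {
  return p2pAccessHelperSketch(
      self, peer,
      [&enabledLinks](const FakeDevice &, const FakeDevice &) { --enabledLinks; });
}
```

Passing the differing step as a callable keeps the validation in one place, so a later change to the pair check cannot drift between the enable and disable paths.
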
The disablePeerAccessStateMachineAndSourceAllocationPersists test was
failing intermittently because deferred frees from the preceding test
complete asynchronously, causing UR_DEVICE_INFO_GLOBAL_MEM_FREE to
report more free memory than the baseline captured at the start of the
test.

Remove the unreliable source-device free-memory assertion and the
allocation it required, keeping only the state-machine checks (disable
succeeds, double-disable returns UR_RESULT_ERROR_INVALID_OPERATION).
The source-device allocation property is already covered by
allocatingDeviceMemoryWillResultInOOM which runs first in isolation.
@ldorau ldorau force-pushed the SYCLE2EUSM_Add_P2P_USM_residency_test_for_L0_v2_adapter branch from 16546c8 to 51cabf7 Compare May 11, 2026 14:58
@ldorau ldorau requested a review from lslusarczyk May 11, 2026 15:00
@ldorau ldorau force-pushed the SYCLE2EUSM_Add_P2P_USM_residency_test_for_L0_v2_adapter branch 2 times, most recently from 51cabf7 to 6f5deb6 Compare May 12, 2026 13:44
ldorau added 5 commits May 12, 2026 14:41
…oreEnablingPeerAccess

Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com>
Add a P2P peer-status check in command_list_manager::appendUSMMemcpy
that queries the source pointer's owning device via
zeMemGetAllocProperties.  If the source is device memory on a different
device, the adapter verifies that the source device's peer table grants
access to the queue's device (srcDevice->peers[queueDevice->Id] ==
ENABLED).  Returns UR_RESULT_ERROR_INVALID_OPERATION if P2P access has
not been enabled.

Previously, zeCommandListAppendMemoryCopy would silently succeed for
cross-device copies via the copy engine regardless of P2P state, making
it impossible to test that ext_oneapi_disable_peer_access actually
revokes access.

Also adds negative-pair tests that verify urEnqueueUSMMemcpy fails when
P2P is disabled:
- enablePeerAccessStateMachineAndSourceAllocationFailsWithoutP2P
- p2pReadFailsWithoutPeerAccessDisabled

Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com>
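
The peer-status check described in this commit can be sketched roughly as follows. This is an illustration under stated assumptions, not the adapter code: the allocation lookup that the commit performs via `zeMemGetAllocProperties` is replaced by a plain `SketchAllocInfo` struct, and `SketchDevice`, `SketchResult`, `kMaxDevices`, and `checkMemcpySource` are hypothetical names. Only the indexing `srcDevice->peers[queueDevice->Id]` and the "invalid operation" outcome come from the commit message.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for UR/L0 types.
enum class SketchResult { Success, InvalidOperation };
enum class PeerStatus { Disabled, Enabled };
enum class AllocKind { Host, Device, Shared };

constexpr std::size_t kMaxDevices = 4; // assumption: small fixed device count

struct SketchDevice {
  std::size_t id;
  // peers[i] tells whether device i may access this device's allocations.
  // Value-initialization leaves every entry Disabled.
  std::array<PeerStatus, kMaxDevices> peers{};
};

// Stand-in for what the allocation-properties query would report.
struct SketchAllocInfo {
  AllocKind kind;
  const SketchDevice *owner; // nullptr for host memory
};

// Reject a copy whose source is device memory owned by another device
// unless the owner's peer table grants access to the queue's device.
inline SketchResult checkMemcpySource(const SketchAllocInfo &src,
                                      const SketchDevice &queueDevice) {
  if (src.kind != AllocKind::Device || src.owner == nullptr)
    return SketchResult::Success; // only device memory is checked
  if (src.owner->id == queueDevice.id)
    return SketchResult::Success; // same-device copy needs no P2P
  return src.owner->peers[queueDevice.id] == PeerStatus::Enabled
             ? SketchResult::Success
             : SketchResult::InvalidOperation;
}
```

With `dev0` owning the source and `dev1` owning the queue, the sketch returns `InvalidOperation` until `dev0.peers[dev1.id]` is set to `Enabled`, which mirrors the negative-pair tests listed above.
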
The peer table lives on peerDevice: peerDevice->peers[commandDevice->Id]
tracks whether commandDevice is allowed to access peerDevice's
allocations.

Update urUsmP2PChangePeerAccessExp to lock peerDevice's mutex,
read/write peerDevice's peer table, use peerDevice's platform for
context iteration, and pass (peerDevice, commandDevice) to
changeResidentDevice and validateP2PDevicePair.

Also fix urUsmP2PPeerAccessGetInfoExp to query the peer table on
peerDevice rather than commandDevice.

Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com>
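
The direction of the peer table is easy to get backwards, so here is a small sketch of it. The types and the function `changePeerAccessSketch` are hypothetical; the real `urUsmP2PChangePeerAccessExp` also locks `peerDevice`'s mutex and updates pool residency, which is omitted here. Two behaviours are assumptions rather than facts from this PR: that every entry starts out disabled, and that a repeated enable is rejected the same way the commit messages say a double-disable is.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for the adapter's types.
enum class TableResult { Success, InvalidOperation };
enum class AccessState { Disabled, Enabled };

struct TableDevice {
  std::size_t id;
  // Assumption: entries default to Disabled.
  std::array<AccessState, 4> peers{};
};

// commandDevice asks to access peerDevice's allocations, so the entry that
// changes is peerDevice.peers[commandDevice.id], never the reverse.
inline TableResult changePeerAccessSketch(const TableDevice &commandDevice,
                                          TableDevice &peerDevice, bool enable) {
  AccessState &entry = peerDevice.peers[commandDevice.id];
  const AccessState wanted =
      enable ? AccessState::Enabled : AccessState::Disabled;
  if (entry == wanted)
    return TableResult::InvalidOperation; // double enable or double disable
  entry = wanted;
  return TableResult::Success;
}
```

After a successful enable, only `peerDevice.peers[commandDevice.id]` changes; `commandDevice.peers[peerDevice.id]` stays untouched, because access in the opposite direction is a separate grant.
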
Adds sycl/test-e2e/USM/P2P/p2p_usm_residency.cpp to verify that
the Level Zero v2 adapter restricts USM device memory residency to
only those peer devices for which P2P access has been explicitly
enabled via ext_oneapi_enable_peer_access.

Phase 1 (P2P disabled): allocates 1 MB on dev0 and checks that
dev1 free memory does not decrease, proving the allocation is not
made resident on dev1.

Phase 2 (P2P enabled): allocates 1 MB on dev0 and checks that
dev1 free memory decreases by at least the allocation size,
proving the allocation is resident on dev1.

Also adds the 'two-or-more-gpu-devices' lit feature to
lit.cfg.py, set when sycl-ls reports at least two GPU devices.
The test uses this feature to skip on single-GPU machines.

Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com>
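
The pass criteria of Phase 1 and Phase 2 reduce to two comparisons on dev1's free memory, sketched below. The helper names are hypothetical and the test file itself is not reproduced in this conversation; the sketch only encodes the two conditions stated in the commit message, with a guard against unsigned underflow when computing the difference.

```cpp
#include <cassert>
#include <cstdint>

// Allocation size used by both phases, per the commit message (1 MB).
constexpr std::uint64_t kAllocSize = 1024 * 1024;

// Phase 1 (P2P disabled): dev1 free memory must not decrease.
inline bool notResidentOnPeer(std::uint64_t freeBefore,
                              std::uint64_t freeAfter) {
  return freeAfter >= freeBefore;
}

// Phase 2 (P2P enabled): dev1 free memory must drop by at least allocSize.
// The first comparison prevents freeBefore - freeAfter from wrapping around.
inline bool residentOnPeer(std::uint64_t freeBefore, std::uint64_t freeAfter,
                           std::uint64_t allocSize) {
  return freeBefore >= freeAfter && freeBefore - freeAfter >= allocSize;
}
```

Free-memory readings can be affected by other activity on the device, as the earlier commit about deferred frees notes, so checks of this kind are best run with nothing else allocating on dev1.
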
…isable

Add Phase 3 to p2p_usm_residency.cpp that enables then disables P2P
access from dev1 to dev0, then attempts a memcpy via dev1's queue.
The test passes if the memcpy throws an exception or if the copied
data does not match the original fill pattern, confirming that
ext_oneapi_disable_peer_access actually revokes access.
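
The Phase 3 pass condition ("the memcpy throws, or the copied data does not match") can be sketched without a SYCL toolchain as shown below. Everything here is hypothetical scaffolding: `accessRevokedSketch` takes any callable in place of the memcpy issued through dev1's queue, and `rejectingCopy` and `workingCopy` are fake copies used only to exercise both outcomes. The `std::exception` catch relies on `sycl::exception` deriving from `std::exception`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <exception>
#include <stdexcept>
#include <vector>

constexpr unsigned char kPattern = 0xAB; // hypothetical fill pattern

// Returns true when access looks revoked: the copy throws, or the
// destination does not end up holding the expected fill pattern.
template <typename CopyFn>
bool accessRevokedSketch(CopyFn &&copy, std::size_t count) {
  // Pre-fill with the bitwise complement so stale data can never match.
  std::vector<unsigned char> dst(count, static_cast<unsigned char>(~kPattern));
  try {
    copy(dst.data(), count);
  } catch (const std::exception &) {
    return true; // the copy was rejected outright
  }
  return !std::all_of(dst.begin(), dst.end(),
                      [](unsigned char b) { return b == kPattern; });
}

// Fake copy that behaves like a runtime rejecting the operation.
inline void rejectingCopy(unsigned char *, std::size_t) {
  throw std::runtime_error("peer access disabled");
}

// Fake copy that succeeds and delivers the pattern.
inline void workingCopy(unsigned char *dst, std::size_t count) {
  std::fill_n(dst, count, kPattern);
}
```

Accepting either outcome keeps the check independent of whether a given runtime rejects the copy up front or lets it complete without delivering the data.
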

ldorau commented May 13, 2026

Incorporated into #21889

@ldorau ldorau closed this May 13, 2026