Fix POSIX backend queue fallback handling#1605
Conversation
Guard POSIX request posting against unavailable I/O queues and let tests use the backend's compiled default queue instead of forcing Linux AIO. Co-authored-by: OpenAI Codex <codex@openai.com>
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds null-checking for the POSIX backend I/O queue in postXfer, fails engine initialization when a requested queue type cannot be instantiated, changes request-handle cleanup to delete allocated objects (and early-return on null), and updates tests and Meson wiring to use and validate the backend's compiled default queue. ChangesPOSIX Backend Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Split POSIX queue initialization errors into separate logs and clarify request-handle deletion by using the POSIX cast only for validation. Co-authored-by: OpenAI Codex <codex@openai.com>
The POSIX backend fallback fix is split into PR #1605, so keep this branch scoped to the LIBFABRIC postXfer thread-pool feature. Co-authored-by: OpenAI Codex <codex@openai.com>
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@test/unit/plugins/posix/meson.build`:
- Around line 19-24: The meson build only emits HAVE_LINUXAIO and HAVE_LIBURING,
so add the POSIX AIO compile define where posix_test_compile_defs is built:
check the has_posix_aio build variable and append a define like
'-DHAVE_POSIXAIO' alongside the existing has_linux_aio / has_io_uring branches
so nixl_posix_test.cpp can detect POSIXAIO as a supported default queue; apply
the same change to the other occurrence noted around the second block (the one
at 29-29).
In `@test/unit/plugins/posix/nixl_posix_test.cpp`:
- Around line 85-87: The changed blocks have clang-format/style violations;
reformat the modified else/print blocks (e.g., the block that prints "this build
does not include Linux AIO or io_uring support." and the nearby std::endl/stream
output lines) and the other similar regions noted by running the repository's
clang-format configuration or applying the project's formatting script so
spacing, indentation, and line breaks match repo style; ensure you update the
affected statements in nixl_posix_test.cpp (the std::cout/std::cerr + std::endl
lines) to conform to the formatter and re-run CI locally to verify formatting
passes.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 02649959-ea59-4880-9214-19fa50222221
📒 Files selected for processing (2)
test/unit/plugins/posix/meson.buildtest/unit/plugins/posix/nixl_posix_test.cpp
Teach the POSIX plugin test which queue implementations were compiled in and reject POSIX AIO-only or unavailable io_uring configurations at startup with a clear error message. Co-authored-by: OpenAI Codex <codex@openai.com>
9741147 to
87b9a84
Compare
|
/build |
|
/build |
Modified to reflect different slurm server for CI Signed-off-by: Ofer Achler <ofer.achler@gmail.com>
Reverted the head node back to the general cluster Signed-off-by: Ofer Achler <ofer.achler@gmail.com>
|
/ok to test 3a8d073 |
|
/build |
1 similar comment
|
/build |
Avoid RTTI in POSIX request-handle paths and mark the defensive missing I/O queue check as unlikely. Co-authored-by: OpenAI Codex <codex@openai.com>
What
Why
Some build environments have POSIX AIO available but not Linux AIO/liburing. The existing test path forced Linux AIO and could hit a null queue, causing a SIGSEGV in nixlPosixBackendReqH::postXfer.
Testing
Split out from the LIBFABRIC postXfer thread-pool PR (#1581) so the POSIX fix can be reviewed independently: #1581
Summary by CodeRabbit
Bug Fixes
Tests
Chores
Chores