@jusiskin jusiskin commented Nov 15, 2024

What was the problem/requirement? (What/Why)

In low-memory situations on Linux, the worker agent can fail to cancel a running session action.

On Linux, a process can send a signal to another process only if at least one of the following is true:

  • the target process is owned by the same OS user as the signalling process OR
  • the process sending the signal holds the CAP_KILL capability

To overcome this limitation, openjd-sessions used sudo to create a bash process as the target user which then sends the OS signal. When the system has low memory, there may not be sufficient memory to create the subprocesses successfully.
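The permission rule can be observed directly from Python. The following is a minimal sketch, not code from this PR; `try_signal` is a hypothetical helper name. It relies only on `os.kill`, which raises `PermissionError` when the kernel rejects the signal and `ProcessLookupError` when the target does not exist.

```python
import os
import signal


def try_signal(pid: int, sig: int = signal.SIGTERM) -> str:
    """Attempt to signal ``pid`` directly and report what the kernel decided.

    ``try_signal`` is a hypothetical helper used for illustration only.
    """
    try:
        os.kill(pid, sig)
    except PermissionError:
        # EPERM: the target is owned by another user and the caller
        # does not hold CAP_KILL.
        return "denied"
    except ProcessLookupError:
        # ESRCH: there is no process with this PID.
        return "missing"
    return "sent"


if __name__ == "__main__":
    # Signal 0 runs the existence and permission checks without
    # delivering a signal.
    print(try_signal(os.getpid(), 0))  # our own process
    print(try_signal(1, 0))  # PID 1 is usually owned by root
```

No subprocess is created here, which is why a direct signal of this kind does not depend on there being enough free memory to start `sudo` and `bash`.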

What was the solution? (How)

OpenJobDescription/openjd-sessions-for-python#196 is a work-in-progress enhancement to leverage CAP_KILL and send signals to cross-user processes without creating subprocesses.

This PR builds on this support for CAP_KILL so the worker agent directly signals cross-user processes.

  1. The code for install-deadline-worker was modified to configure the worker agent systemd unit with the CAP_KILL ambient capability. This has the effect of adding CAP_KILL to the worker agent process' permitted/effective/inheritable capability sets.
  2. The worker agent now drops CAP_KILL from the inheritable capability set early in the program startup so that other threads and subprocesses will not inherit this privileged capability.
  3. The openjd-sessions change takes care of the rest and handles direct signalling for cross-user processes.
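Step 2 above could be sketched roughly as follows. This is an illustration, not the worker agent's actual code: it assumes libcap is reachable through `ctypes`, and `drop_cap_kill_from_inheritable` is a hypothetical name. The libcap functions used (`cap_get_proc`, `cap_set_flag`, `cap_set_proc`, `cap_free`) and the numeric constants come from libcap's public C API and the Linux capability headers.

```python
import ctypes
import ctypes.util

# Values from <linux/capability.h> and <sys/capability.h>.
CAP_KILL = 5
CAP_INHERITABLE = 2  # cap_flag_t: selects the inheritable set
CAP_CLEAR = 0  # cap_flag_value_t: lower the flag


def drop_cap_kill_from_inheritable() -> bool:
    """Clear CAP_KILL from the calling thread's inheritable set.

    Hypothetical helper. Returns False when libcap cannot be located,
    treating the library as an optional dependency.
    """
    lib_name = ctypes.util.find_library("cap")
    if lib_name is None:
        return False
    libcap = ctypes.CDLL(lib_name, use_errno=True)

    # Declare the C signatures so pointers are not truncated to int.
    libcap.cap_get_proc.restype = ctypes.c_void_p
    libcap.cap_get_proc.argtypes = []
    libcap.cap_set_flag.restype = ctypes.c_int
    libcap.cap_set_flag.argtypes = [
        ctypes.c_void_p,
        ctypes.c_int,
        ctypes.c_int,
        ctypes.POINTER(ctypes.c_int),
        ctypes.c_int,
    ]
    libcap.cap_set_proc.restype = ctypes.c_int
    libcap.cap_set_proc.argtypes = [ctypes.c_void_p]
    libcap.cap_free.restype = ctypes.c_int
    libcap.cap_free.argtypes = [ctypes.c_void_p]

    caps = libcap.cap_get_proc()
    if not caps:
        raise OSError(ctypes.get_errno(), "cap_get_proc failed")
    try:
        cap_list = (ctypes.c_int * 1)(CAP_KILL)
        if libcap.cap_set_flag(caps, CAP_INHERITABLE, 1, cap_list, CAP_CLEAR) != 0:
            raise OSError(ctypes.get_errno(), "cap_set_flag failed")
        # The permitted and effective sets are written back unchanged,
        # so this process can still send cross-user signals itself.
        if libcap.cap_set_proc(caps) != 0:
            raise OSError(ctypes.get_errno(), "cap_set_proc failed")
    finally:
        libcap.cap_free(caps)
    return True
```

Capability sets are per-thread on Linux, so a call like this would need to run before any other threads are started. The kernel also lowers an ambient capability automatically once it is no longer inheritable, which is what keeps later subprocesses from receiving CAP_KILL.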

What is the impact of this change?

This change allows the worker agent to cancel session actions more robustly in low-memory situations.

How was this change tested?

The cross-user direct signal cancelation tests live in openjd-sessions, but a security end-to-end test was added to ensure that session actions are not able to signal other processes.
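For illustration only, one way to check that a subprocess did not receive CAP_KILL is to read the capability bitmasks that Linux exposes in `/proc/<pid>/status`. This is a sketch and not the end-to-end test added in this PR; the function names are hypothetical.

```python
CAP_KILL = 5  # bit position, from <linux/capability.h>
CAP_SETS = ("CapInh", "CapPrm", "CapEff", "CapBnd", "CapAmb")


def parse_capability_sets(status_text: str) -> dict[str, int]:
    """Extract the capability bitmasks from /proc/<pid>/status text."""
    sets: dict[str, int] = {}
    for line in status_text.splitlines():
        name, _, value = line.partition(":")
        if name in CAP_SETS:
            # The kernel prints each set as a hexadecimal bitmask.
            sets[name] = int(value.strip(), 16)
    return sets


def holds_capability(sets: dict[str, int], set_name: str, cap: int) -> bool:
    """Return True if bit ``cap`` is raised in the named capability set."""
    return bool((sets.get(set_name, 0) >> cap) & 1)


def may_signal_other_users(sets: dict[str, int]) -> bool:
    """Only the effective set is consulted when the kernel checks a signal."""
    return holds_capability(sets, "CapEff", CAP_KILL)
```

A session action could, for example, run `cat /proc/self/status` and pass the output to `parse_capability_sets`. The expectation would be that CAP_KILL is absent from `CapInh`, `CapPrm`, `CapEff`, and `CapAmb`, although it may still appear in the bounding set `CapBnd`.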

Was this change documented?

No

Is this a breaking change?

No


By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@jusiskin jusiskin added the enhancement (New feature or request) and security (Pull requests that could impact security) labels Nov 15, 2024
@jusiskin jusiskin changed the title from "Linux direct signals cap kill" to "feat: more robust Linux cross-user cancelation under low memory conditions" Nov 15, 2024
@jusiskin jusiskin force-pushed the linux_direct_signals_cap_kill branch 4 times, most recently from d9cb534 to 04a236e November 22, 2024 16:24
@jusiskin jusiskin marked this pull request as ready for review November 22, 2024 16:24
@jusiskin jusiskin requested a review from a team as a code owner November 22, 2024 16:24
@ddneilson ddneilson self-requested a review November 22, 2024 17:10
@ddneilson commented

Oh, also, the README should probably be updated to list the (optional) dependency on libcap.so on Linux.

@jusiskin jusiskin force-pushed the linux_direct_signals_cap_kill branch from 04a236e to beb0ff7 November 26, 2024 15:59
@jusiskin jusiskin force-pushed the linux_direct_signals_cap_kill branch from beb0ff7 to cdd677e November 26, 2024 16:23
@ddneilson ddneilson left a comment

Thanks for taking this on, Josh. Looks great!

This was referenced Jun 24, 2025
This was referenced Jul 23, 2025
This was referenced Oct 23, 2025