cred: properly pass and test creds on other threads #17273

robn · 2025-04-24T22:27:25Z

Motivation and Context

I claim, perhaps incorrectly, that we aren't currently holding CRED() properly when we need to hand it off to another task. If so, this should fix that. It also lets us simplify how we do cross-task cred checks on Linux, avoiding the need to test the permissions of another task directly. This is good, because we lose the API to do that in 6.15 (see #17229).

Description

Commit message follows here. This all took a while to understand, so I consider this a form of documentation (perhaps better done in comments somewhere, but here we are).

Background

Various admin operations will be invoked by some userspace task, but the work will be done on a separate kernel thread at a later time. Snapshots are an example, which are triggered through zfs_ioc_snapshot() -> dsl_dataset_snapshot(), but the actual work is from a task dispatched to dp_sync_taskq.

Many such tasks end up in dsl_enforce_ds_ss_limits(), where various limits and permissions are enforced. Among other things, it is necessary to ensure that the invoking task (that is, the user) has permission to do things. We can't simply check if the running task has permission; it is a privileged kernel thread, which can do anything.

However, in the general case it's not safe to simply query the task for its permissions at the check time, as the task may not exist any more, or its permissions may have changed since it was first invoked. So instead, we capture the permissions by saving CRED() in the user task, and then using it for the check through the secpolicy_* functions.

Current implementation

The current code calls CRED() to get the credential, which gets a pointer to the cred_t inside the current task and passes it to the worker task. However, it doesn't take a reference to the cred_t, and so expects that it won't change, and that the task continues to exist. In practice that is always the case, because we don't let the calling task return from the kernel until the work is done.

For Linux, we also take a reference to the current task, because the Linux credential APIs for the most part do not check an arbitrary credential, but rather, query what a task can do. See secpolicy_zfs_proc(). Again, we don't take a reference on the task, just a pointer to it.

Changes

We change to calling crhold() on the task credential, and crfree() when we're done with it. This ensures it stays alive and unchanged for the duration of the call.

On the Linux side, we change the main policy checking function priv_policy_ns() to use override_creds()/revert_creds() if necessary to make the provided credential active in the current task, allowing the standard task-permission APIs to do the needed check. Since the task pointer is no longer required, this lets us entirely remove secpolicy_zfs_proc() and the need to carry a task pointer around as well.

Discussion

The main question in all of this is whether crhold() is sufficient to keep a credential alive and unchanged, even if the calling task goes away.

I believe this is true for both FreeBSD and Linux, as they appear to have the same model: a credential is considered immutable, and modifications are done by making a copy, changing it, then atomically swapping it into place. The credential is only freed when the last reference is dropped. (This comes from reading ucred(9) on FreeBSD, Documentation/security/credentials.rst on Linux, and reading as much kernel code as I could handle for both).

If I'm wrong, and the credential is still bound to the task lifetime, then we'll need to do something more complex, using crdup() (which on Linux would be implemented with prepare_creds(), I think). But since we're never modifying the credential, my feeling is that we probably don't need to do this.

How Has This Been Tested?

Compile checked on Linux 4.19 -> 6.14, and FreeBSD 14.

Successful ZTS run completed on 6.1.x.

However, I am not sure that we have any tests that properly exercise this, so I don't know if it's actually better, or even working properly.

Types of changes

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Performance enhancement (non-breaking change which improves efficiency)
Code cleanup (non-breaking change which makes code smaller or more readable)
Breaking change (fix or feature that would cause existing functionality to change)
Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
Documentation (a change to man pages or other documentation)

Checklist:

My code follows the OpenZFS code style requirements.
I have updated the documentation accordingly.
I have read the contributing document.
I have added tests to cover my changes.
I have run the ZFS Test Suite with this change applied.
All commit messages are properly formatted and contain Signed-off-by.

robn · 2025-04-24T22:28:20Z

@snajpa would appreciate you taking a look (couldn't request a review from you, hrm). Thanks!

snajpa

Yes, accessing current->cred doesn't need anything special and increased reference ensures it won't go away while we need it. This is the proper way to do it, thank you.

snajpa · 2025-04-25T12:16:26Z

@snajpa would appreciate you taking a look (couldn't request a review from you, hrm). Thanks!

I don't think I can influence that (?) AFAIK I haven't clicked on anything that would disallow pinging me for reviews, might be the project's settings, don't know.

module/os/linux/zfs/policy.c

module/zfs/dmu_recv.c

### Background Various admin operations will be invoked by some userspace task, but the work will be done on a separate kernel thread at a later time. Snapshots are an example, which are triggered through zfs_ioc_snapshot() -> dsl_dataset_snapshot(), but the actual work is from a task dispatched to dp_sync_taskq. Many such tasks end up in dsl_enforce_ds_ss_limits(), where various limits and permissions are enforced. Among other things, it is necessary to ensure that the invoking task (that is, the user) has permission to do things. We can't simply check if the running task has permission; it is a privileged kernel thread, which can do anything. However, in the general case it's not safe to simply query the task for its permissions at the check time, as the task may not exist any more, or its permissions may have changed since it was first invoked. So instead, we capture the permissions by saving CRED() in the user task, and then using it for the check through the secpolicy_* functions. ### Current implementation The current code calls CRED() to get the credential, which gets a pointer to the cred_t inside the current task and passes it to the worker task. However, it doesn't take a reference to the cred_t, and so expects that it won't change, and that the task continues to exist. In practice that is always the case, because we don't let the calling task return from the kernel until the work is done. For Linux, we also take a reference to the current task, because the Linux credential APIs for the most part do not check an arbitrary credential, but rather, query what a task can do. See secpolicy_zfs_proc(). Again, we don't take a reference on the task, just a pointer to it. ### Changes We change to calling crhold() on the task credential, and crfree() when we're done with it. This ensures it stays alive and unchanged for the duration of the call. On the Linux side, we change the main policy checking function priv_policy_ns() to use override_creds()/revert_creds() if necessary to make the provided credential active in the current task, allowing the standard task-permission APIs to do the needed check. Since the task pointer is no longer required, this lets us entirely remove secpolicy_zfs_proc() and the need to carry a task pointer around as well. Sponsored-by: https://despairlabs.com/sponsor/ Signed-off-by: Rob Norris <[email protected]>

amotin

Makes sense for my unqualified in the area eye.

robn requested review from behlendorf and amotin April 24, 2025 22:27

snajpa approved these changes Apr 25, 2025

View reviewed changes

amotin reviewed Apr 25, 2025

View reviewed changes

module/os/linux/zfs/policy.c Show resolved Hide resolved

module/zfs/dmu_recv.c Show resolved Hide resolved

module/zfs/dmu_recv.c Show resolved Hide resolved

robn force-pushed the cred-passing branch from c660fb7 to ba97334 Compare April 26, 2025 05:56

amotin approved these changes Apr 28, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

cred: properly pass and test creds on other threads #17273

cred: properly pass and test creds on other threads #17273

robn commented Apr 24, 2025

robn commented Apr 24, 2025

snajpa left a comment

snajpa commented Apr 25, 2025

amotin left a comment

cred: properly pass and test creds on other threads #17273

Are you sure you want to change the base?

cred: properly pass and test creds on other threads #17273

Conversation

robn commented Apr 24, 2025

Motivation and Context

Description

Background

Current implementation

Changes

Discussion

How Has This Been Tested?

Types of changes

Checklist:

robn commented Apr 24, 2025

snajpa left a comment

Choose a reason for hiding this comment

snajpa commented Apr 25, 2025

amotin left a comment

Choose a reason for hiding this comment