Skip to content

cred: properly pass and test creds on other threads #17273

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

robn
Copy link
Member

@robn robn commented Apr 24, 2025

Motivation and Context

I claim, perhaps incorrectly, that we aren't currently holding CRED() properly when we need to hand it off to another task. If so, this should fix that. It also lets us simplify how we do cross-task cred checks on Linux, avoiding the need to test the permissions of another task directly. This is good, because we lose the API to do that in 6.15 (see #17229).

Description

Commit message follows here. This all took a while to understand, so I consider this a form of documentation (perhaps better done in comments somewhere, but here we are).

Background

Various admin operations will be invoked by some userspace task, but the work will be done on a separate kernel thread at a later time. Snapshots are an example, which are triggered through zfs_ioc_snapshot() -> dsl_dataset_snapshot(), but the actual work is from a task dispatched to dp_sync_taskq.

Many such tasks end up in dsl_enforce_ds_ss_limits(), where various limits and permissions are enforced. Among other things, it is necessary to ensure that the invoking task (that is, the user) has permission to do things. We can't simply check if the running task has permission; it is a privileged kernel thread, which can do anything.

However, in the general case it's not safe to simply query the task for its permissions at the check time, as the task may not exist any more, or its permissions may have changed since it was first invoked. So instead, we capture the permissions by saving CRED() in the user task, and then using it for the check through the secpolicy_* functions.

Current implementation

The current code calls CRED() to get the credential, which gets a pointer to the cred_t inside the current task and passes it to the worker task. However, it doesn't take a reference to the cred_t, and so expects that it won't change, and that the task continues to exist. In practice that is always the case, because we don't let the calling task return from the kernel until the work is done.

For Linux, we also take a reference to the current task, because the Linux credential APIs for the most part do not check an arbitrary credential, but rather, query what a task can do. See secpolicy_zfs_proc(). Again, we don't take a reference on the task, just a pointer to it.

Changes

We change to calling crhold() on the task credential, and crfree() when we're done with it. This ensures it stays alive and unchanged for the duration of the call.

On the Linux side, we change the main policy checking function priv_policy_ns() to use override_creds()/revert_creds() if necessary to make the provided credential active in the current task, allowing the standard task-permission APIs to do the needed check. Since the task pointer is no longer required, this lets us entirely remove secpolicy_zfs_proc() and the need to carry a task pointer around as well.

Discussion

The main question in all of this is whether crhold() is sufficient to keep a credential alive and unchanged, even if the calling task goes away.

I believe this is true for both FreeBSD and Linux, as they appear to have the same model: a credential is considered immutable, and modifications are done by making a copy, changing it, then atomically swapping it into place. The credential is only freed when the last reference is dropped. (This comes from reading ucred(9) on FreeBSD, Documentation/security/credentials.rst on Linux, and reading as much kernel code as I could handle for both).

If I'm wrong, and the credential is still bound to the task lifetime, then we'll need to do something more complex, using crdup() (which on Linux would be implemented with prepare_creds(), I think). But since we're never modifying the credential, my feeling is that we probably don't need to do this.

How Has This Been Tested?

Compile checked on Linux 4.19 -> 6.14, and FreeBSD 14.

Successful ZTS run completed on 6.1.x.

However, I am not sure that we have any tests that properly exercise this, so I don't know if it's actually better, or even working properly.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • Documentation (a change to man pages or other documentation)

Checklist:

@robn robn requested review from behlendorf and amotin April 24, 2025 22:27
@robn
Copy link
Member Author

robn commented Apr 24, 2025

@snajpa would appreciate you taking a look (couldn't request a review from you, hrm). Thanks!

Copy link
Contributor

@snajpa snajpa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, accessing current->cred doesn't need anything special and increased reference ensures it won't go away while we need it. This is the proper way to do it, thank you.

@snajpa
Copy link
Contributor

snajpa commented Apr 25, 2025

@snajpa would appreciate you taking a look (couldn't request a review from you, hrm). Thanks!

I don't think I can influence that (?) AFAIK I haven't clicked on anything that would disallow pinging me for reviews, might be the project's settings, don't know.

### Background

Various admin operations will be invoked by some userspace task, but the
work will be done on a separate kernel thread at a later time. Snapshots
are an example, which are triggered through zfs_ioc_snapshot() ->
dsl_dataset_snapshot(), but the actual work is from a task dispatched to
dp_sync_taskq.

Many such tasks end up in dsl_enforce_ds_ss_limits(), where various
limits and permissions are enforced. Among other things, it is necessary
to ensure that the invoking task (that is, the user) has permission to
do things. We can't simply check if the running task has permission; it
is a privileged kernel thread, which can do anything.

However, in the general case it's not safe to simply query the task for
its permissions at the check time, as the task may not exist any more,
or its permissions may have changed since it was first invoked. So
instead, we capture the permissions by saving CRED() in the user task,
and then using it for the check through the secpolicy_* functions.

### Current implementation

The current code calls CRED() to get the credential, which gets a
pointer to the cred_t inside the current task and passes it to the
worker task. However, it doesn't take a reference to the cred_t, and so
expects that it won't change, and that the task continues to exist. In
practice that is always the case, because we don't let the calling task
return from the kernel until the work is done.

For Linux, we also take a reference to the current task, because the
Linux credential APIs for the most part do not check an arbitrary
credential, but rather, query what a task can do. See
secpolicy_zfs_proc(). Again, we don't take a reference on the task, just
a pointer to it.

### Changes

We change to calling crhold() on the task credential, and crfree() when
we're done with it. This ensures it stays alive and unchanged for the
duration of the call.

On the Linux side, we change the main policy checking function
priv_policy_ns() to use override_creds()/revert_creds() if necessary to
make the provided credential active in the current task, allowing the
standard task-permission APIs to do the needed check. Since the task
pointer is no longer required, this lets us entirely remove
secpolicy_zfs_proc() and the need to carry a task pointer around as
well.

Sponsored-by: https://despairlabs.com/sponsor/
Signed-off-by: Rob Norris <[email protected]>
Copy link
Member

@amotin amotin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense for my unqualified in the area eye.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants