dsl_dir: avoid dd_lock during snapshots_changed updates#18472
dsl_dir: avoid dd_lock during snapshots_changed updates#18472Gality369 wants to merge 1 commit intoopenzfs:masterfrom
Conversation
There was a problem hiding this comment.
Pull request overview
This PR fixes a lock-order inversion involving dd->dd_lock and ZAP updates during snapshot “snapshots_changed” timestamp persistence by moving the zap_update() call out from under dd_lock while preserving correctness of the on-disk value.
Changes:
- Stop holding
dd->dd_lockacrosszap_update()when persistingDD_FIELD_SNAPSHOTS_CHANGED. - Keep the in-memory
dd_snap_cmtimeupdate anddsl_dir_zapify()underdd_lock, but perform the ZAP write after unlocking. - Add a retry loop to re-check
dd_snap_cmtimeafter the on-disk update and re-write if it changed during the write.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
2911d64 to
297e920
Compare
Avoid holding dd_lock while updating the on-disk snapshots_changed timestamp. Both dsl_dir_zapify() and zap_update() may dirty buffers and recurse into space accounting, which can take dd_lock. Holding dd_lock across either operation can therefore preserve the lock-order inversion reported by lockdep. Only protect the in-memory dd_snap_cmtime update with dd_lock. Perform the zapify and ZAP update without dd_lock held, and retry the on-disk write if another updater advanced dd_snap_cmtime while the write was in progress. Signed-off-by: ZhengYuan Huang <gality369@gmail.com>
297e920 to
ca51bc1
Compare
amotin
left a comment
There was a problem hiding this comment.
I'd personally unwrap the lines now that indentation is reduced, but whatever.
May be we could add ASSERT(dsl_pool_sync_context(dp)) here to make it obvious?
| mutex_exit(&dd->dd_lock); | ||
|
|
||
| mos = dd->dd_pool->dp_meta_objset; | ||
| ddobj = dd->dd_object; |
There was a problem hiding this comment.
nit: Declaring these variables where they're assigned would be nice too. In fact, there's probably no need for ddobj if you unwrap the lines below, dd->dd_object can be used directly. I would keep mos though since it makes it clear zap_update() is operating on the mos object.
objset_t *mos = dd->dd_pool->dp_meta_objset;
Motivation and Context
A tmp snapshot update still holds dd_lock while persisting the
snapshots_changed ZAP entry. Both dsl_dir_zapify() and zap_update() can dirty
buffers and recurse into dsl_dir_willuse_space(), which preserves the
dd_lock -> zap_rwlock lock ordering that lockdep reports.
Description
This change keeps the in-memory dd_snap_cmtime update under dd_lock, but
moves both dsl_dir_zapify() and the DD_FIELD_SNAPSHOTS_CHANGED zap_update()
out from under that lock.
That removes the dd_lock -> zap_rwlock edge involved in the reported
lock-order inversion.
The final version does not use a retry loop. After re-checking the callers,
this path runs in syncing context and the relevant updates are serialized, so
there should not be a concurrent updater for the same dataset that can cause
the on-disk snapshots_changed value to move backwards.
How Has This Been Tested?
Tested on linux.
Types of changes
Checklist:
Signed-off-by.