Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix issue #63: invalid reads after write #64

Open
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

floren
Copy link
Contributor

@floren floren commented May 6, 2021

Per our discussion on #62, this makes all writes atomic.

floren added 2 commits May 6, 2021 12:50
…after write.

This reworks the createKeyFileWithLock function (used only by
writeStreamWithLock) so it always writes to a temporary file, which
is renamed by writeStreamWithLock when complete. A deferred recover is
added to clean up the temporary file if writing fails. This essentially
deprecates the TempDir option by making all writes atomic.
diskv.go Outdated Show resolved Hide resolved
diskv.go Outdated Show resolved Hide resolved
diskv.go Outdated Show resolved Hide resolved
issues_test.go Outdated Show resolved Hide resolved
diskv.go Outdated Show resolved Hide resolved
… BasePath.

The key walking functions have been modified to ignore everything inside that directory, so getting a list of keys while a WriteStream is in progress will not include the temporary files.
@floren
Copy link
Contributor Author

floren commented Aug 18, 2021

PTAL. I've overhauled the way we handle atomic write temp files by jamming them into a (hidden) subdirectory. The name of this directory is configurable; anything within that directory will be ignored when listing keys.

@peterbourgon
Copy link
Owner

Are writes not atomic if you set TempDir to the same physical disk as the diskv path?

@floren
Copy link
Contributor Author

floren commented Aug 19, 2021

They are. The problem is that if you don't set TempDir, ReadStream is broken. You will read inconsistent data if someone sneaks in a Write before you finish reading. So the current situation is basically "Set both BaseDir and TempDir unless you want diskv to be broken, but we won't warn you if you disregard TempDir".

I've put quite a few hours into fixing this now. If you're not interested in the fix, feel free to close the issue and we will continue to maintain and use my fork and not bother you with further PRs.

@peterbourgon
Copy link
Owner

Unless I fundamentally misunderstand the semantics of modern filesystems — which is possible! — an open os.File should be a snapshot of its contents when opened, and should not change if, while open, a different process or thread modifies the file on disk. Is that not true?

@floren
Copy link
Contributor Author

floren commented Aug 19, 2021

Copy TestIssue63 from the branch into the same file in master. It should fail--it did on my Linux box, anyway. It demonstrates that what is read from an already-opened file will indeed change if the file on disk is changed.

@peterbourgon
Copy link
Owner

Ah, yes, I see, it's a consequence of the TempDir decisionmaking and the mode flags we pass.

Solution here is to enforce a default TempDir in the diskv path if none is provided, rather than a separate option altogether, I think.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants