etcd client can report error after a successful write #16659

Closed · seebs opened this issue Sep 27, 2023 · 10 comments

Comments

@seebs commented Sep 27, 2023

What happened?

Under exceedingly rare and possibly-stupid circumstances, a compare-and-set operation returned an error but had in fact written the new value, so retrying it after clearing the alarm condition failed because the stored value was no longer the prior value.

What did you expect to happen?

I expected that if a Txn reported an error, it would not have written anything.

How can we reproduce it (as minimally and precisely as possible)?

Very sorry for not having a clean reproducer, and unfortunately this one is vanishingly rare even under the circumstances where it applies.

The essentials are: (1) do a compare-and-set, and (2) have the database fill up while you're doing it.

I encountered this in test code validating recovery code that attempts to handle mvcc: database space exceeded problems. To do this, we set our size limit unreasonably small, then ran loops of code roughly like this:

resp, err = cli.Txn(ctx).
	If(clientv3.Compare(clientv3.Value(key), "=", oldValueStr)).
	Then(clientv3.OpPut(key, newValueStr)).
	Commit()

Our database size was small enough that we were hitting database space exceeded every few hundred iterations, causing us to trigger a Compact operation, and so on. Then we'd retry the operation. And, probably 99% of the time, everything continued smoothly.

Every so often, though, we'd observe:
(1) the database revision is one higher than it was before we tried this
(2) the stored value for that key is now newValueStr

So of course, the CAS wouldn't succeed anymore. We don't get an error back from the next call, but resp.Succeeded is false.
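For illustration, that retry observation looks roughly like the sketch below. It reuses cli, ctx, key, oldValueStr, and newValueStr from the snippet above and is not the exact test code:

resp, err := cli.Txn(ctx).
	If(clientv3.Compare(clientv3.Value(key), "=", oldValueStr)).
	Then(clientv3.OpPut(key, newValueStr)).
	Commit()
if err == nil && !resp.Succeeded {
	// The CAS was rejected without an error; read the key back to see
	// which value is actually stored.
	got, gerr := cli.Get(ctx, key)
	if gerr == nil && len(got.Kvs) > 0 && string(got.Kvs[0].Value) == newValueStr {
		// The earlier Txn that returned "database space exceeded" was in
		// fact applied before the error was reported.
	}
}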

This is in a single-threaded test case, with no actual user, and the etcd in question is an embedded one created for the test case, not a server which is accessible to other users, so we can logically rule out "maybe it failed because someone else attempted the same CAS and theirs succeeded".

What this suggests to me is that Commit() can succeed enough that future recovery will observe its write, but then end up yielding database space exceeded.

Reproducing this might be as simple as just running this in a loop, and if you get a database space exceeded error, run status, get a revision, and request compaction to that revision. It may be relevant that we're specifically requesting compaction to the current revision reported by Status().
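A minimal sketch of that recovery sequence, assuming a clientv3.Client named cli and a single endpoint (the helper name recoverFromNoSpace is ours, not part of the etcd API; the Defragment step is not described above but is normally needed to actually reclaim the space that compaction frees):

import (
	"context"

	clientv3 "go.etcd.io/etcd/client/v3"
)

// recoverFromNoSpace compacts to the revision reported by Status(),
// defragments the backend, and disarms the NOSPACE alarm so that
// writes are accepted again.
func recoverFromNoSpace(ctx context.Context, cli *clientv3.Client, endpoint string) error {
	status, err := cli.Status(ctx, endpoint)
	if err != nil {
		return err
	}
	// Compact to the current revision reported by Status(), as described above.
	if _, err := cli.Compact(ctx, status.Header.Revision); err != nil {
		return err
	}
	if _, err := cli.Defragment(ctx, endpoint); err != nil {
		return err
	}
	// An empty AlarmMember tells the client to disarm all active alarms.
	_, err = cli.AlarmDisarm(ctx, &clientv3.AlarmMember{})
	return err
}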

Anything else we need to know?

No response

Etcd version (please run commands below)

Not using an external etcd, but go.mod says:
go.etcd.io/etcd/server/v3 v3.5.5

Etcd configuration (command line flags or environment variables)

The relevant part is probably that we're using a 64MB database size to make it easy to produce out-of-space errors while testing our recovery code.

Etcd debug information (please run commands below, feel free to obfuscate the IP address or FQDN in the output)

N/A

Relevant log output

No log output.
seebs added the type/bug label Sep 27, 2023
@serathius (Member)

Interesting find. Based on the etcd data model, I think this is pretty much expected. Bbolt writes and storage size are not deterministic, which means that with the same disk size one node can run out of disk while another doesn't.

I think the issue was never investigated because etcd works around it via the db size quota. The quota is expected to always be set lower than the available disk size, with an additional buffer. This guarantees that a write going over the quota will still succeed, but all following requests will not.

My recommendation would be to ensure that you set your db quota correctly to prevent this inconsistency.

Still, I think it would be interesting to investigate the correctness of etcd's behavior when out of quota or out of disk in the robustness tests.
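As a concrete illustration of the quota recommendation above, a minimal sketch with an embedded server like the one in the report (the data directory and the 8 GiB value are illustrative; QuotaBackendBytes is the setting that must stay well below the free disk space):

package main

import (
	"log"

	"go.etcd.io/etcd/server/v3/embed"
)

func main() {
	cfg := embed.NewConfig()
	cfg.Dir = "default.etcd"
	// Keep the backend quota well below the actual free disk space, so a
	// write that slips past the quota check can still be persisted; the
	// NOSPACE alarm then blocks all subsequent writes.
	cfg.QuotaBackendBytes = 8 * 1024 * 1024 * 1024 // illustrative value
	e, err := embed.StartEtcd(cfg)
	if err != nil {
		log.Fatal(err)
	}
	defer e.Close()
}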

@tjungblu (Contributor)

I added a failpoint and test for it here some time ago: #16018

If anyone wants to pick this up, feel free.

@ahrtr (Member) commented Sep 28, 2023

Thanks for raising this issue. It's because etcd performs the space quota check on both the API path and the apply path.

When the check on the API path (on receiving a Put/Txn/LeaseGrant request) fails, it returns an ErrGRPCNoSpace error and the request definitely fails; nothing is written in this case.

if qa.q.Available(r) {
	return nil
}
req := &pb.AlarmRequest{
	MemberID: uint64(qa.id),
	Action:   pb.AlarmRequest_ACTIVATE,
	Alarm:    pb.AlarmType_NOSPACE,
}
qa.a.Alarm(ctx, req)
return rpctypes.ErrGRPCNoSpace

But it's possible that the API check passes while the space quota check on the apply path (see below) fails. In this case, etcd still applies the Txn, which is why the write was successful even though the client got an error response.

ok := a.q.Available(rt)
resp, trace, err := a.applierV3.Txn(ctx, rt)
if err == nil && !ok {
	err = errors.ErrNoSpace
}

No matter which check fails, a NOSPACE alarm is generated:

a := &pb.AlarmRequest{
	MemberID: uint64(s.MemberId()),
	Action:   pb.AlarmRequest_ACTIVATE,
	Alarm:    pb.AlarmType_NOSPACE,
}

I think we should

  • either update the documentation to clearly describe this behavior, or
  • remove the quota check on the apply path and only perform the quota check on the API layer.

@seebs (Author) commented Oct 2, 2023

Intuitively, if the quota check is going to generate an alarm, I'd expect the error return to happen without trying the Txn.

@serathius (Member)

remove the quota check on the apply path and only perform the quota check on the API layer.

I don't agree with that. The quota check on the apply path is the important one.

@jmhbnz (Member) commented Nov 7, 2023

Hey @ahrtr - Where in the documentation would you suggest we outline this behavior? This issue could be a good candidate for tomorrow's ContribFest if we can nail down where we want to document the behavior 🙏🏻

@ahrtr (Member) commented Nov 7, 2023

Hey @ahrtr - Where in the documentation would you suggest we outline this behavior?

I would suggest getting this clearly documented in https://etcd.io/docs/v3.5/op-guide/maintenance/

This issue could be a good candidate for tomorrow's ContribFest

Sounds good.

stale bot commented Mar 17, 2024

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 21 days if no further activity occurs. Thank you for your contributions.

stale bot added the stale label Mar 17, 2024
siyuanfoundation self-assigned this May 9, 2024
stale bot removed the stale label May 9, 2024
@jmhbnz (Member) commented May 9, 2024

Discussed during the sig-etcd triage meeting: this got missed at the last ContribFest, so it is still a valid issue.

/assign @siyuanfoundation

@siyuanfoundation (Contributor)

Added the case to the documentation. Closing the issue.
