etcd client can report error after a successful write #16659

Closed · seebs opened this issue Sep 27, 2023 · 10 comments

Comments

@seebs commented Sep 27, 2023

What happened?

Under exceedingly rare and possibly-stupid circumstances, a compare-and-set operation returned an error but had in fact written the new value, so retrying it after clearing the alarm condition failed because the stored value was no longer the prior value.

What did you expect to happen?

I expected that if a Txn reported an error, it would not have written anything.

How can we reproduce it (as minimally and precisely as possible)?

Very sorry for not having a clean reproducer, and unfortunately this one is vanishingly rare even under the circumstances where it applies.

The essentials are: (1) do a compare-and-set, and (2) have the database fill up while you're doing it.

I encountered this in test code validating recovery code that attempts to handle mvcc: database space exceeded problems. To do this, we set our size limit unreasonably small, then ran loops of code roughly like this:

resp, err = cli.Txn(ctx).
	If(clientv3.Compare(clientv3.Value(key), "=", oldValueStr)).
	Then(clientv3.OpPut(key, newValueStr)).
	Commit()

Our database size was small enough that we were hitting database space exceeded every few hundred iterations, causing us to trigger a Compact operation, and so on. Then we'd retry the operation. And, probably 99% of the time, everything continued smoothly.

Every so often, though, we'd observe:
(1) the database revision is one higher than it was before we tried this
(2) the stored value for that key is now newValueStr

So of course, the CAS wouldn't succeed anymore. We don't get an error back from the next call, but resp.Succeeded is false.
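For illustration, that retry observation looks roughly like the sketch below. It reuses cli, ctx, key, oldValueStr, and newValueStr from the snippet above and is not the exact test code:

resp, err := cli.Txn(ctx).
	If(clientv3.Compare(clientv3.Value(key), "=", oldValueStr)).
	Then(clientv3.OpPut(key, newValueStr)).
	Commit()
if err == nil && !resp.Succeeded {
	// The CAS was rejected without an error; read the key back to see
	// which value is actually stored.
	got, gerr := cli.Get(ctx, key)
	if gerr == nil && len(got.Kvs) > 0 && string(got.Kvs[0].Value) == newValueStr {
		// The earlier Txn that returned "database space exceeded" was in
		// fact applied before the error was reported.
	}
}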

This is in a single-threaded test case, with no actual user, and the etcd in question is an embedded one created for the test case, not a server which is accessible to other users, so we can logically rule out "maybe it failed because someone else attempted the same CAS and theirs succeeded".

What this suggests to me is that Commit() can succeed enough that future recovery will observe its write, but then end up yielding database space exceeded.

Reproducing this might be as simple as just running this in a loop, and if you get a database space exceeded error, run status, get a revision, and request compaction to that revision. It may be relevant that we're specifically requesting compaction to the current revision reported by Status().
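A minimal sketch of that recovery sequence, assuming a clientv3.Client named cli and a single endpoint (the helper name recoverFromNoSpace is ours, not part of the etcd API; the Defragment step is not described above but is normally needed to actually reclaim the space that compaction frees):

import (
	"context"

	clientv3 "go.etcd.io/etcd/client/v3"
)

// recoverFromNoSpace compacts to the revision reported by Status(),
// defragments the backend, and disarms the NOSPACE alarm so that
// writes are accepted again.
func recoverFromNoSpace(ctx context.Context, cli *clientv3.Client, endpoint string) error {
	status, err := cli.Status(ctx, endpoint)
	if err != nil {
		return err
	}
	// Compact to the current revision reported by Status(), as described above.
	if _, err := cli.Compact(ctx, status.Header.Revision); err != nil {
		return err
	}
	if _, err := cli.Defragment(ctx, endpoint); err != nil {
		return err
	}
	// An empty AlarmMember tells the client to disarm all active alarms.
	_, err = cli.AlarmDisarm(ctx, &clientv3.AlarmMember{})
	return err
}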

Anything else we need to know?

No response

Etcd version (please run commands below)

Not using an external etcd, but go.mod says:
go.etcd.io/etcd/server/v3 v3.5.5

Etcd configuration (command line flags or environment variables)

The relevant part is probably that we're using a 64MB database size to make it easy to produce out-of-space errors while testing our recovery code.

Etcd debug information (please run commands below, feel free to obfuscate the IP address or FQDN in the output)

N/A

Relevant log output

No log output.
seebs added the type/bug label Sep 27, 2023
@serathius (Member)

Interesting find. Based on the etcd data model, I think this is pretty much expected. Bbolt writes and storage size are not deterministic, which means that with the same disk size one node can run out of disk while another doesn't.

I think the issue was never investigated because etcd works around it via the db size quota. The quota is expected to always be set lower than the available disk size, with an additional buffer. This guarantees that a write going over the quota will still succeed, but all following requests will not.

My recommendation would be to ensure that you set your db quota correctly to prevent this inconsistency.

Still, I think it would be interesting to investigate the correctness of etcd's behavior when out of quota or out of disk in the robustness tests.
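As a concrete illustration of the quota recommendation above, a minimal sketch with an embedded server like the one in the report (the data directory and the 8 GiB value are illustrative; QuotaBackendBytes is the setting that must stay well below the free disk space):

package main

import (
	"log"

	"go.etcd.io/etcd/server/v3/embed"
)

func main() {
	cfg := embed.NewConfig()
	cfg.Dir = "default.etcd"
	// Keep the backend quota well below the actual free disk space, so a
	// write that slips past the quota check can still be persisted; the
	// NOSPACE alarm then blocks all subsequent writes.
	cfg.QuotaBackendBytes = 8 * 1024 * 1024 * 1024 // illustrative value
	e, err := embed.StartEtcd(cfg)
	if err != nil {
		log.Fatal(err)
	}
	defer e.Close()
}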

@tjungblu (Contributor)

I added a failpoint and test for it here some time ago: #16018

If anyone wants to pick this up, feel free.

@ahrtr (Member) commented Sep 28, 2023

Thanks for raising this issue. It's because etcd performs the space quota check on both the API path and the apply path.

When the check on the API path (on receiving a Put/Txn/LeaseGrant request) fails, it returns an ErrGRPCNoSpace error and the request definitely fails; nothing is written in this case.

if qa.q.Available(r) {
	return nil
}
req := &pb.AlarmRequest{
	MemberID: uint64(qa.id),
	Action:   pb.AlarmRequest_ACTIVATE,
	Alarm:    pb.AlarmType_NOSPACE,
}
qa.a.Alarm(ctx, req)
return rpctypes.ErrGRPCNoSpace

But it's possible that the API check passes while the space quota check on the apply path (see below) fails. In this case, etcd still applies the Txn, which is why the write was successful even though the client got an error response.

ok := a.q.Available(rt)
resp, trace, err := a.applierV3.Txn(ctx, rt)
if err == nil && !ok {
	err = errors.ErrNoSpace
}

No matter which check fails, a NOSPACE alarm is generated:

a := &pb.AlarmRequest{
	MemberID: uint64(s.MemberId()),
	Action:   pb.AlarmRequest_ACTIVATE,
	Alarm:    pb.AlarmType_NOSPACE,
}

I think we should

  • either update the documentation to clearly describe this behavior, or
  • remove the quota check on the apply path and only perform the quota check on the API layer.

@seebs (Author) commented Oct 2, 2023

Intuitively, if the quota check is going to generate an alarm, I'd expect the error return to happen without trying the Txn.

@serathius (Member)

remove the quota check on the apply path and only perform the quota check on the API layer.

I don't agree with that. The quota check on the apply path is the important one.

@jmhbnz (Member) commented Nov 7, 2023

Hey @ahrtr - Where in the documentation would you suggest we outline this behavior? This issue could be a good candidate for tomorrow's ContribFest if we can nail down where we want to document the behavior 🙏🏻

@ahrtr (Member) commented Nov 7, 2023

Hey @ahrtr - Where in the documentation would you suggest we outline this behavior?

I would suggest getting this clearly documented in https://etcd.io/docs/v3.5/op-guide/maintenance/

This issue could be a good candidate for tomorrow's ContribFest

Sounds good.

stale bot commented Mar 17, 2024

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 21 days if no further activity occurs. Thank you for your contributions.

stale bot added the stale label Mar 17, 2024
siyuanfoundation self-assigned this May 9, 2024
stale bot removed the stale label May 9, 2024
@jmhbnz (Member) commented May 9, 2024

Discussed during the sig-etcd triage meeting: this got missed at the last ContribFest, so it is still a valid issue.

/assign @siyuanfoundation

@siyuanfoundation (Contributor)

Added the case to the documentation. Closing the issue.
