Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix: Potential data race #1338

Open
wants to merge 7 commits into
base: master
Choose a base branch
from
Open

Fix: Potential data race #1338

wants to merge 7 commits into from

Conversation

gunli
Copy link
Contributor

@gunli gunli commented Feb 24, 2025

Fixes #1337

(or if this PR is one task of a github issue, please add Master Issue: #<xyz> to link to the master issue.)

Master Issue: #1337

Motivation

Fix potential data race

Modifications

  1. Add a context and a cancel func to pendingItem;
  2. In pendingItem.done(), cancel the context;
  3. If context is done, stop writing.

Verifying this change

  • Make sure that the change passes the CI checks.

This change is a trivial rework / code cleanup without any test coverage.

Does this pull request potentially affect one of the following parts:

No

Documentation

  • Does this pull request introduce a new feature? (no)
  • If yes, how is the feature documented? (not applicable / docs / GoDocs / not documented)
  • If a feature is not applicable for documentation, explain why?
  • If a feature is not documented yet in this PR, please create a followup issue for adding the documentation

@gunli
Copy link
Contributor Author

gunli commented Feb 24, 2025

This bug has been discussed in #1333, and this PR should be merged after #1333 and #1336 , may be rebase is required.
@shibd @RobertIndie @nodece @BewareMyPower @crossoverJie

Copy link
Member

@nodece nodece left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

shibd

This comment was marked as resolved.

@shibd shibd dismissed their stale review February 26, 2025 14:07

no same issue

Copy link
Member

@nodece nodece left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should not check the cnx is nil, this is incorrect, I think the previous commit is correct, I know the CI failed.

Please check https://github.com/apache/pulsar-client-go/actions/runs/13538441376/job/37836739015

2025-02-26T08:20:54.6856030Z     producer_test.go:475: 
2025-02-26T08:20:54.6856682Z         	Error Trace:	/pulsar/pulsar-client-go/pulsar/producer_test.go:475
2025-02-26T08:20:54.6857513Z         	Error:      	Expected nil, but got: &errors.errorString{s:"request timed out"}
2025-02-26T08:20:54.6858292Z         	Test:       	TestFlushInPartitionedProducer
2025-02-26T08:20:54.6858745Z --- FAIL: TestFlushInPartitionedProducer (30.02s)
2025-02-26T08:20:54.6859183Z panic: runtime error: invalid memory address or nil pointer dereference [recovered]
2025-02-26T08:20:54.6859909Z 	panic: runtime error: invalid memory address or nil pointer dereference
2025-02-26T08:20:54.6860253Z [signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x12c4163]
2025-02-26T08:20:54.6860265Z 
2025-02-26T08:20:54.6860426Z goroutine 14599 [running]:
2025-02-26T08:20:54.6860710Z testing.tRunner.func1.2({0x1505d80, 0x1ef9000})
2025-02-26T08:20:54.6860964Z 	/pulsar/go/src/testing/testing.go:1632 +0x3fc
2025-02-26T08:20:54.6861127Z testing.tRunner.func1()
2025-02-26T08:20:54.6861355Z 	/pulsar/go/src/testing/testing.go:1635 +0x6b6
2025-02-26T08:20:54.6861512Z panic({0x1505d80?, 0x1ef9000?})
2025-02-26T08:20:54.6861720Z 	/pulsar/go/src/runtime/panic.go:785 +0x132
2025-02-26T08:20:54.6862233Z github.com/apache/pulsar-client-go/pulsar.TestFlushInPartitionedProducer(0xc000374000)
2025-02-26T08:20:54.6862565Z 	/pulsar/pulsar-client-go/pulsar/producer_test.go:476 +0x343
2025-02-26T08:20:54.6862751Z testing.tRunner(0xc000374000, 0x16a9940)
2025-02-26T08:20:54.6862985Z 	/pulsar/go/src/testing/testing.go:1690 +0x227
2025-02-26T08:20:54.6863352Z created by testing.(*T).Run in goroutine 1
2025-02-26T08:20:54.6863588Z 	/pulsar/go/src/testing/testing.go:1743 +0x826
2025-02-26T08:20:54.6863859Z FAIL	github.com/apache/pulsar-client-go/pulsar	285.026s

Your changes break this test, please fix test.

@gunli
Copy link
Contributor Author

gunli commented Feb 27, 2025

Your changes break this test, please remove the ctx check, and then fix test.

image

It pass on my host.

@nodece
Copy link
Member

nodece commented Feb 27, 2025

@gunli Please revert to bc26223.

It seems a network issue, the ci has been re-run to research the root cause.

@gunli
Copy link
Contributor Author

gunli commented Feb 27, 2025

@gunli Please revert to bc26223.

It seems a network issue, the ci has been re-run to research the root cause.

image

All tests pass on my host, may be we should set some timeout config in the test.

And I have reverted it.

@nodece nodece added this to the v0.15.0 milestone Mar 6, 2025
@gunli gunli marked this pull request as draft March 7, 2025 05:22
@gunli
Copy link
Contributor Author

gunli commented Mar 7, 2025

mark is as draft until #1336 is merged

@gunli gunli marked this pull request as ready for review March 7, 2025 07:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Bug][Producer/Connection] Potential data race
4 participants