[Bug] Duplicate blockId causes shuffle Server network to be full. #2119

Closed
2 of 3 tasks
yl09099 opened this issue Sep 13, 2024 · 9 comments · Fixed by #2124

Comments

@yl09099
Contributor

yl09099 commented Sep 13, 2024

Code of Conduct

Search before asking

  • I have searched in the issues and found no similar issues.

Describe the bug

In Netty mode, if a BlockId is duplicated, GetInMemoryShuffleData keeps looping over the same BlockId. This saturates the network of a single Shuffle Server machine and causes the Spark application to hang.
The following log shows the duplicate BlockId being read:
image
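
One way a duplicate could produce this kind of loop, sketched under assumptions: suppose the client pages through the in-memory buffer by sending the last BlockId it received, and the server resumes right after the first position where that id appears. The issue does not describe the actual mechanism, so the class and the methods `readAfter` and `roundsUntilDone` below are hypothetical and are not Uniffle APIs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class DuplicateBlockIdLoop {
    // Hypothetical server-side read (not a Uniffle API): return up to `batch` ids
    // that follow the FIRST occurrence of lastBlockId; -1 means "start at the head".
    static List<Long> readAfter(List<Long> buffer, long lastBlockId, int batch) {
        int start = 0;
        if (lastBlockId != -1L) {
            int idx = buffer.indexOf(lastBlockId);
            start = idx < 0 ? 0 : idx + 1;
        }
        int end = Math.min(buffer.size(), start + batch);
        return new ArrayList<>(buffer.subList(start, end));
    }

    // Hypothetical client loop: read until an empty batch comes back.
    // Returns the number of rounds used, or -1 if maxRounds was not enough.
    static int roundsUntilDone(List<Long> buffer, int batch, int maxRounds) {
        long last = -1L;
        for (int round = 1; round <= maxRounds; round++) {
            List<Long> got = readAfter(buffer, last, batch);
            if (got.isEmpty()) {
                return round;
            }
            last = got.get(got.size() - 1);
        }
        return -1;
    }

    public static void main(String[] args) {
        List<Long> unique = Arrays.asList(1L, 2L, 3L, 4L);
        List<Long> withDuplicate = Arrays.asList(1L, 2L, 3L, 2L);
        System.out.println(roundsUntilDone(unique, 2, 100));        // prints 3
        System.out.println(roundsUntilDone(withDuplicate, 2, 100)); // prints -1
    }
}
```

With the duplicated id `2` at the tail, every read after the first resumes from the first `2` and returns `[3, 2]` again, so the cursor never advances. If the real read path works along these lines, that would match the repeated reads and the network traffic described above.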

Affects Version(s)

0.10.0

Uniffle Server Log Output

No response

Uniffle Engine Log Output

No response

Uniffle Server Configurations

No response

Uniffle Engine Configurations

No response

Additional context

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!
@yl09099
Contributor Author

yl09099 commented Sep 13, 2024

@zuston @maobaolong @jerqi @xianjingfeng Have you encountered this problem when using it?

@yl09099
Contributor Author

yl09099 commented Sep 13, 2024

In ShuffleServer, when using ShuffleBufferWithLinkedList, should the List in the ShuffleBufferWithLinkedList class be changed to a Set?

@jerqi
Contributor

jerqi commented Sep 13, 2024

> In ShuffleServer, when using ShuffleBufferWithLinkedList, should the List in the ShuffleBufferWithLinkedList class be changed to a Set?

The order of the BlockIds must be guaranteed not to change.

@yl09099
Contributor Author

yl09099 commented Sep 13, 2024

> > In ShuffleServer, when using ShuffleBufferWithLinkedList, should the List in the ShuffleBufferWithLinkedList class be changed to a Set?
>
> The order of the BlockIds must be guaranteed not to change.

I can't figure out what caused the BlockId to repeat. At first I thought it was a client timeout, but I checked all the logs and found no timeouts.

@maobaolong
Member

@lwllvyb Is this similar to our issue?

@xianjingfeng
Member

> In ShuffleServer, when using ShuffleBufferWithLinkedList, should the List in the ShuffleBufferWithLinkedList class be changed to a Set?

You can try ShuffleBufferWithSkipList
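
For comparison, here is a rough sketch of how a skip-list map behaves when it is keyed by BlockId. That keying is an assumption, since the thread does not say how ShuffleBufferWithSkipList is built. The class and the method `keysAfterInserts` are hypothetical; only `java.util.concurrent.ConcurrentSkipListMap` is a real JDK class.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentSkipListMap;

public class SkipListKeyedByBlockId {
    // Hypothetical helper: insert blocks keyed by id and report the resulting key order.
    static List<Long> keysAfterInserts(long[] blockIds) {
        ConcurrentSkipListMap<Long, String> blocks = new ConcurrentSkipListMap<>();
        for (long id : blockIds) {
            // put() on an existing key replaces the value, so a key never appears twice.
            blocks.put(id, "payload-" + id);
        }
        return new ArrayList<>(blocks.keySet());
    }

    public static void main(String[] args) {
        // Id 10 is inserted twice; keys come back sorted, not in insertion order.
        System.out.println(keysAfterInserts(new long[] {30L, 10L, 20L, 10L})); // prints [10, 20, 30]
    }
}
```

A map like this removes duplicate keys, but it iterates in key order rather than insertion order, which matters if readers rely on the order in which blocks arrived.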

@yl09099
Contributor Author

yl09099 commented Sep 14, 2024

> > In ShuffleServer, when using ShuffleBufferWithLinkedList, should the List in the ShuffleBufferWithLinkedList class be changed to a Set?
>
> The order of the BlockIds must be guaranteed not to change.

Using LinkedHashSet ensures both de-duplication and order.

@jerqi
Contributor

jerqi commented Sep 14, 2024

> > > In ShuffleServer, when using ShuffleBufferWithLinkedList, should the List in the ShuffleBufferWithLinkedList class be changed to a Set?
> >
> > The order of the BlockIds must be guaranteed not to change.
>
> Using LinkedHashSet ensures both de-duplication and order.

Could you share some more information about this? If it can guarantee both, that's fine.

@yl09099
Contributor Author

yl09099 commented Sep 17, 2024

Hash table and linked list implementation of the Set interface, with predictable iteration order. This implementation differs from HashSet in that it maintains a doubly-linked list running through all of its entries. This linked list defines the iteration ordering, which is the order in which elements were inserted into the set (insertion-order). Note that insertion order is not affected if an element is re-inserted into the set. (An element e is reinserted into a set s if s.add(e) is invoked when s.contains(e) would return true immediately prior to the invocation.)
https://docs.oracle.com/javase/8/docs/api/java/util/LinkedHashSet.html
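
The Javadoc behavior quoted above can be checked with a small standalone sketch. The class and the method `dedupKeepingOrder` are hypothetical names used only for illustration, and plain `Long` ids stand in for real blocks.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class LinkedHashSetOrderDemo {
    // Hypothetical helper: drop repeated ids while keeping first-insertion order.
    static List<Long> dedupKeepingOrder(List<Long> blockIds) {
        Set<Long> seen = new LinkedHashSet<>();
        for (Long id : blockIds) {
            // add() returns false for an id already present and leaves its position unchanged.
            seen.add(id);
        }
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        // Id 10 arrives twice; it keeps the position of its first insertion.
        System.out.println(dedupKeepingOrder(Arrays.asList(30L, 10L, 20L, 10L))); // prints [30, 10, 20]
    }
}
```

In a real buffer the elements would be block objects rather than bare ids, so de-duplication would depend on how `equals` and `hashCode` are defined for them.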

zuston pushed a commit that referenced this issue Sep 26, 2024
…shuffleBuffer (#2124)

### What changes were proposed in this pull request?

Fixed block duplication bug.

### Why are the changes needed?

Fix: #2119 

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Existing UT.
maobaolong pushed a commit to maobaolong/incubator-uniffle that referenced this issue Nov 4, 2024
…fault shuffleBuffer (apache#2124)