Fix race conditions in Server #64

gbrooker · 2018-11-11T03:41:15Z

Split out from PR #59

This PR contains a number of fixes to Server:

The continueRunning flag is created on the main queue, but checked on the global queue. The fix makes the flag thread-safe by guarding its getter and setter with a DispatchSemaphore.
The logger is created on the main thread, but is also called on the global thread. The fix is to execute the log call on the main thread asynchronously.
Calling close on a Socket while it is listening resulted in a race condition in BlueSocket. The fix extends Socket to add a forceQuit method which shuts down the socket without any internal checks.
The HAP Server implementation assumes that the start() and stop() functions are called from the main queue. In the same spirit as the many guard statements throughout HAP, a dispatchPrecondition has been added to help developers using the library respect this assumption.
The public stop() was being used both for external shutdown of the server and for internal cleanup. The cleanup part has been moved to tearDownConnections(), which is called only from the global queue, after continueRunning has been set to false. It stops advertising the server on mDNS and it force closes all open sockets. The public stop() method is called from the main thread and simply sets the continueRunning flag to false and force closes the main server socket, causing the previously mentioned cleanup to occur on the global thread.
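For readers who want to see the first fix in isolation, here is a minimal sketch of a flag whose getter and setter are guarded by a DispatchSemaphore. The `FlagHolder` type and the `runFlagDemo()` function are hypothetical names used only for illustration; they are not taken from the PR, and the real Server code may be structured differently.

```swift
import Foundation

/// Hypothetical stand-in for the Server type; only the flag handling is shown.
final class FlagHolder: @unchecked Sendable {
    // Binary semaphore (initial value 1) used as a mutex around the stored flag.
    private let flagLock = DispatchSemaphore(value: 1)
    private var _continueRunning = false

    var continueRunning: Bool {
        get {
            flagLock.wait()
            defer { flagLock.signal() }
            return _continueRunning
        }
        set {
            flagLock.wait()
            defer { flagLock.signal() }
            _continueRunning = newValue
        }
    }
}

/// Sets the flag on the calling thread and polls it on a global queue,
/// mirroring the two-queue access pattern described above.
func runFlagDemo() -> Bool {
    let holder = FlagHolder()
    holder.continueRunning = true

    let workerDone = DispatchSemaphore(value: 0)
    DispatchQueue.global().async {
        // The worker only ever reads the flag through the guarded getter.
        while holder.continueRunning {
            Thread.sleep(forTimeInterval: 0.001)
        }
        workerDone.signal()
    }

    holder.continueRunning = false  // Guarded write from the calling thread.
    workerDone.wait()               // Returns once the worker has seen the change.
    return !holder.continueRunning
}
```

Because every read and write passes through the same semaphore, the worker is guaranteed to observe the updated value on a later iteration and the loop terminates.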

Bouke · 2018-11-12T21:33:07Z

Great input! I somehow would've expected more threading issues to have come up, but maybe we still need to run into those. 😅

> The continueRunning flag is created on the main queue, but checked on the global queue.

Is this an issue? I would guess that setting a boolean is an atomic operation that would not cause any race conditions.

> Calling close on a Socket while it is listening resulted in a race condition in BlueSocket. The fix extends Socket to add a forceQuit method which shuts down the socket without any internal checks.

Is this caused by accepting connections on one queue but stopping from another? At first sight, this looks like an upstream bug in BlueSocket. It would be good to notify them of any issues with their code. They're also open to merging PRs; I've fixed a UDP multicast issue there in the past.

> The public stop() method is called from the main thread and simply sets the continueRunning flag to false and force closes the main server socket, causing the previously mentioned cleanup to occur on the global thread.

So my understanding is that because acceptClientConnection() is a blocking call, we must somehow signal the socket to stop accepting clients. But because of the race condition mentioned before, we cannot simply call close()?

gbrooker · 2018-11-13T13:44:53Z

> > The continueRunning flag is created on the main queue, but checked on the global queue.

> Is this an issue? I would guess that setting a boolean is an atomic operation that would not cause any race conditions.

Unfortunately, as of Swift 4.2 the language offers no atomic operations, not even for Bool. That was a trade-off made to keep the language fast, since guaranteed atomicity carries a significant overhead. To avoid race conditions we have to provide the synchronization ourselves, using locks, semaphores or queues.
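As an illustration of the lock and queue options mentioned above, the sketch below protects a shared counter in two ways: with an NSLock and with a private serial DispatchQueue. The type names `LockedCounter` and `QueueCounter` and the function `runCounterDemo()` are hypothetical and not part of HAP; a counter is used instead of a Bool only because lost updates are easier to observe.

```swift
import Foundation

/// Shared counter protected by an NSLock.
final class LockedCounter: @unchecked Sendable {
    private let lock = NSLock()
    private var value = 0

    func increment() {
        lock.lock()
        defer { lock.unlock() }
        value += 1
    }

    var current: Int {
        lock.lock()
        defer { lock.unlock() }
        return value
    }
}

/// Shared counter protected by funnelling all access through a serial queue.
final class QueueCounter: @unchecked Sendable {
    private let queue = DispatchQueue(label: "example.counter")  // Serial by default.
    private var value = 0

    func increment() {
        queue.sync { value += 1 }
    }

    var current: Int {
        return queue.sync { value }
    }
}

/// Increments both counters from many threads at once and returns the totals.
func runCounterDemo() -> (locked: Int, queued: Int) {
    let locked = LockedCounter()
    let queued = QueueCounter()
    DispatchQueue.concurrentPerform(iterations: 1000) { _ in
        locked.increment()
        queued.increment()
    }
    return (locked.current, queued.current)
}
```

With either form of synchronization in place, both totals come out at exactly 1000. Without it, concurrent `value += 1` calls would be a data race and the result would be undefined.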

> > Calling close on a Socket while it is listening resulted in a race condition in BlueSocket. The fix extends Socket to add a forceQuit method which shuts down the socket without any internal checks.

> Is this caused by accepting connections on one queue but stopping from another? At first sight, this looks like an upstream bug in BlueSocket. It would be good to notify them of any issues with their code. They're also open to merging PRs; I've fixed a UDP multicast issue there in the past.

Yes, the Socket.close() function checks various flags that were set on another queue. I'll take a look at contributing a fix to BlueSocket, but let's not hold up this PR for that.

> > The public stop() method is called from the main thread and simply sets the continueRunning flag to false and force closes the main server socket, causing the previously mentioned cleanup to occur on the global thread.

> So my understanding is that because acceptClientConnection() is a blocking call, we must somehow signal the socket to stop accepting clients. But because of the race condition mentioned before, we cannot simply call close()?

Correct
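The shutdown sequence discussed here can be simulated without a real socket. In the sketch below, `FakeListener` is a hypothetical stand-in whose `accept()` blocks on a semaphore until `forceQuit()` is called. It is not BlueSocket's API, and `SketchServer` is not the HAP Server, so this shows only the control flow: stop() clears the flag and force-closes the listener, which unblocks the accept loop on the global queue, which then runs the cleanup.

```swift
import Foundation

/// Hypothetical listener: accept() blocks until forceQuit() wakes it up.
final class FakeListener: @unchecked Sendable {
    private let wake = DispatchSemaphore(value: 0)

    /// Blocks like a real accept call; returns false once force-closed.
    func accept() -> Bool {
        wake.wait()
        return false
    }

    func forceQuit() {
        wake.signal()
    }
}

/// Hypothetical server showing only the start/stop/teardown control flow.
final class SketchServer: @unchecked Sendable {
    private let lock = NSLock()
    private var _continueRunning = false
    private var _tornDown = false
    private let listener = FakeListener()
    private let finished = DispatchSemaphore(value: 0)

    private var continueRunning: Bool {
        get { lock.lock(); defer { lock.unlock() }; return _continueRunning }
        set { lock.lock(); defer { lock.unlock() }; _continueRunning = newValue }
    }

    var tornDown: Bool {
        lock.lock(); defer { lock.unlock() }
        return _tornDown
    }

    func start() {
        continueRunning = true
        DispatchQueue.global().async {
            // Accept loop: blocks in accept() until a client arrives or the
            // listener is force-closed.
            while self.continueRunning {
                _ = self.listener.accept()
            }
            // Cleanup runs on the global queue, after the flag was cleared.
            self.tearDownConnections()
            self.finished.signal()
        }
    }

    func stop() {
        continueRunning = false  // Tell the loop not to iterate again.
        listener.forceQuit()     // Unblock the pending accept().
    }

    private func tearDownConnections() {
        lock.lock(); defer { lock.unlock() }
        _tornDown = true  // Stand-in for closing sockets and stopping mDNS.
    }

    func waitUntilFinished() {
        finished.wait()
    }
}
```

The order inside stop() matters: the flag is cleared before the listener is woken, so when accept() returns the loop condition is already false and the loop exits instead of blocking again.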

Bouke · 2018-11-14T20:10:34Z

Thank you for the explanation and, of course, for your contribution.

Fix race conditions in Server

903d4f5

This was referenced Nov 11, 2018

November fixes #59

Closed

Observe Characteristic changes and HAP getValues #55

Open

Bouke added 3 commits November 14, 2018 20:53

Remove Xcode file header

3e80e4d

Indent inner scope of conditionals

c2b32ce

Note about force closing sockets

883e986

Bouke merged commit 9da6b6a into Bouke:master Nov 14, 2018
