Skip to content

Conversation

@gbrooker
Copy link
Contributor

@gbrooker gbrooker commented Nov 11, 2018

Split out from PR #59

This PR contains a number of fixes to Server

  • The continueRunning flag is created on the main queue, but checked on the global queue. The fix makes the flag thread safe by using a DispatchSemaphore in the getter and setter
  • The logger is created on the main thread, but is also called on the global thread. The fix is execute the log call on the main thread asynchronously
  • Calling close on a Socket while it is listening, resulted in a race condition in BlueSocket. The fix extends Socket to add a forceQuit method which shuts down the socket without any internal checks.
  • The HAP Server implementation assumes that the start() and stop() functions are called from the main queue. In the same spirit as the many guard statements throughout HAP, a dispatchPrecondition has been added to help developers using the library to ensure they respect this assumption.
  • The public stop() was being used both for external shutdown of the server and for internal cleanup. The cleanup part has been moved to tearDownConnections(), which is called only from the global queue, after continueRunning has been set to false. It stops advertising the server on mDNS and it force closes all open sockets. The public stop() method is called from the main thread and simply sets the continueRunning flag to false and force closes the main server socket, causing the previously mentioned cleanup to occur on the global thread.

@Bouke
Copy link
Owner

Bouke commented Nov 12, 2018

Great input! I somehow would've expected more threading issues to have come up, but maybe we still need to run into those. 😅

The continueRunning flag is created on the main queue, but checked on the global queue.

Is this an issue? I would guess that setting a boolean is an atomic operation that would not cause any race conditions.

Calling close on a Socket while it is listening, resulted in a race condition in BlueSocket. The fix extends Socket to add a forceQuit method which shuts down the socket without any internal checks.

Is this caused because we're accepting connections on one queue, but stopping from another. On first sight, it seems that this is an upstream bug in BlueSocket. It would be good to notify them of any issues with their code. Also, they're open to merging PRs as I've fixed some UDP multicast issue in the past as well.

The public stop() method is called from the main thread and simply sets the continueRunning flag to false and force closes the main server socket, causing the previously mentioned cleanup to occur on the global thread.

So my understanding is that because acceptClientConnection() is a blocking call, we must somehow signal the socket to stop accepting clients. But because of the race condition mentioned before, we cannot simply call close()?

@gbrooker
Copy link
Contributor Author

The continueRunning flag is created on the main queue, but checked on the global queue.

Is this an issue? I would guess that setting a boolean is an atomic operation that would not cause any race conditions.

Unfortunately up to Swift 4.2, there are no atomic operations, not even for Bool types. That was a trade off that was made to keep the language fast, as guaranteed atomicity is a big overhead. To achieve atomicity we must use Locks, Semaphores or Queues to avoid race conditions.

Calling close on a Socket while it is listening, resulted in a race condition in BlueSocket. The fix extends Socket to add a forceQuit method which shuts down the socket without any internal checks.

Is this caused because we're accepting connections on one queue, but stopping from another. On first sight, it seems that this is an upstream bug in BlueSocket. It would be good to notify them of any issues with their code. Also, they're open to merging PRs as I've fixed some UDP multicast issue in the past as well.

Yes, the Socket.close() function is checking various flags which were set on another queue. I'll take a look at contributing to BlueSocket, but let's not hold up this PR for that.

The public stop() method is called from the main thread and simply sets the continueRunning flag to false and force closes the main server socket, causing the previously mentioned cleanup to occur on the global thread.

So my understanding is that because acceptClientConnection() is a blocking call, we must somehow signal the socket to stop accepting clients. But because of the race condition mentioned before, we cannot simply call close()?

Correct

@Bouke Bouke merged commit 9da6b6a into Bouke:master Nov 14, 2018
@Bouke
Copy link
Owner

Bouke commented Nov 14, 2018

Thank you for the explanation and your contribution of course.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants