-
You don't need to use multiple threads (but you are welcome to). The nice thing about NIO is that it does all the multithreading needed to handle I/O under the covers for you.
-
Think in terms of an event/action protocol, as we saw in class and in the slides. NIO's handleMessage methods are perfectly suited to that type of design. Write down your protocol carefully on paper first in event/action form and "debug" that first. It's a lot easier than debugging code. There is no easy way to debug a distributed system with a debugger in the traditional way, because concurrency effects are not easily reproducible, i.e., a debugger ends up changing the very thing you are trying to see. (Nevertheless, systematically using breakpoints within a debugger can still be more useful than relying on console output alone.)
-
Play with the Grader yourself, i.e., increase the number of requests, threads, servers, etc. to stress test your system. As always, the tests provided to you are a good baseline, i.e., if your code passes those tests, it is likely correct. But tests are never meant to be exhaustive, e.g., we may test your code under more stress or on different physical machines. So it is important for your underlying design to be correct.
-
Do not use System.currentTimeMillis(), System.nanoTime(), or similar calls. You cannot assume a globally synchronized clock in a distributed system. Your code should work correctly even if we test it on different physical machines.
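One common way to order events without a wall clock is a logical counter in the style of Lamport clocks. The sketch below is only an illustration under that assumption; `LogicalClock` and its methods are hypothetical names, not part of the assignment code, and your protocol may well not need them.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper, not part of the assignment code:
// a Lamport-style logical clock that never reads wall-clock time.
class LogicalClock {
    private final AtomicLong time = new AtomicLong(0);

    // Call before sending a message or recording a local event.
    // Returns the timestamp to attach to that message or event.
    long tick() {
        return time.incrementAndGet();
    }

    // Call when a message stamped with `received` arrives.
    // Moves the local clock past both its own value and the received one.
    long onReceive(long received) {
        return time.updateAndGet(local -> Math.max(local, received) + 1);
    }

    // Current value, for logging or inspection.
    long current() {
        return time.get();
    }
}
```

Two nodes can still produce equal timestamps, so using such a counter for a total order would additionally need a tie-breaker, for example the node ID.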
-
You can rest assured that there will be no process failures in the test environment. The assignment is only testing consistency, not fault tolerance.
-
You can NOT assume FIFO delivery, i.e., you cannot assume that `handleMessage(m1)` will be called before `handleMessage(m2)` at a given receiving node even if m1 was sent before m2 and both were sent by the same sender. This is true even though we use TCP, because of the multithreaded nature of the underlying NIO library.
-
`handleMessage(m1)` is not guaranteed to be isolated from `handleMessage(m2)` for two messages `m1` and `m2`, i.e., their execution may be concurrent (again because the underlying NIO library is multi-threaded), so you must ensure that your replicated server implementation is thread-safe.
-
Design requirement 6 on the response conveying globally committed semantics will be tested only using `callbackSend`. Note that the simple `Client.send` cannot be modified, so it is not possible for the entry server to correctly send back a committed response for requests sent via that simple send.
-
As already indicated in the strawman ReplicatedServer.java:74 (as well as in Part 1), you need to stop `serverMessenger` and any objects with runnable components that you create, otherwise `Grader` won't terminate and the autograder will time out.
-
`Grader` may sometimes fail a test because a write may have propagated in your protocol to some servers but not others. If the output logs clearly show that this is the problem and increasing `Grader:SLEEP` fixes it, such test failures are false positives and not a problem with your protocol.
  - With the modified `Grader:verifyOrderConsistent` in the latest commit, this tip is no longer necessary.
-
Your protocol must not arbitrarily lose requests, as there is no reason for message loss in the assignment. The implicit expectation is that if, say, 100 writes are sent, 100 writes must eventually get completed, i.e., after waiting long enough for the protocol to propagate all writes to all servers. (Otherwise, a trivial and unacceptable way of ensuring totally ordered writes would be to do no writes at all.)
-
If you contact us with any Gradescope issue, please start your message with "I have verified that my code consistently passes all `Grader` tests and terminates gracefully on my local machine...". Otherwise you are still developing and it is too soon to go to Gradescope, which is simply not a good environment to test or debug your design or code.
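For the tips above on non-FIFO delivery and concurrent `handleMessage` calls, one possible approach is sketched below. It assumes you add your own per-sender sequence number (0, 1, 2, ...) to each message; `ReorderBuffer`, its `accept` method, and the `deliver` callback are hypothetical names for illustration, not part of the NIO library or the assignment code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.BiConsumer;

// Hypothetical helper: restores per-sender send order and serializes delivery.
// Assumes each sender numbers its own messages 0, 1, 2, ... before sending.
class ReorderBuffer {
    // Callback invoked with (sender, payload) once a message is deliverable.
    private final BiConsumer<String, String> deliver;
    // Per sender: the next sequence number that may be delivered.
    private final Map<String, Long> nextExpected = new HashMap<>();
    // Per sender: messages that arrived early, sorted by sequence number.
    private final Map<String, TreeMap<Long, String>> pending = new HashMap<>();

    ReorderBuffer(BiConsumer<String, String> deliver) {
        this.deliver = deliver;
    }

    // Meant to be called from a handleMessage-style callback.
    // synchronized: concurrent callers take turns, so the maps stay consistent
    // and deliveries happen one at a time, in sequence order per sender.
    synchronized void accept(String sender, long seq, String payload) {
        long next = nextExpected.getOrDefault(sender, 0L);
        if (seq < next) {
            return; // already delivered earlier; ignore the duplicate
        }
        TreeMap<Long, String> queue =
                pending.computeIfAbsent(sender, s -> new TreeMap<>());
        queue.put(seq, payload);
        // Deliver every buffered message that is now next in line.
        while (!queue.isEmpty() && queue.firstKey() == next) {
            deliver.accept(sender, queue.pollFirstEntry().getValue());
            next++;
        }
        nextExpected.put(sender, next);
    }
}
```

A single lock around everything is the simplest choice and may be coarser than you need. Note also that this only restores order per sender; it does not by itself give a total order across different senders.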
(More to be added if/as needed.)