-
Notifications
You must be signed in to change notification settings - Fork 979
RPC Protocol
Drill protocols:
(TBD)
Drill uses Netty for RPC. A good understanding of Netty is assumed here.
At the most basic level, Drill messages are of one of five "modes":
Message Mode Name | Number | Description |
---|---|---|
REQUEST | 0 | Request message sent from client or server |
RESPONSE | 1 | Response message returned server or client |
RESPONSE_FAILURE | 2 | Error response |
PING | 3 | Heartbeat message from client |
PONG | 4 | Heartbeat response from server |
In this table, a REQUEST
is sent from either end of the connection. A request invites a RESPONSE
(or a RESPONSE_FAILURE
). The PING
/PONG
messages are Drill's heartbeat keep-alive mechanism.
Drill messages (both REQUEST
and RESPONSE
) have three parts:
- Header, which includes the RPC type, and the lengths of the two body portions. The lengths are consumed internally, RPC type passed to the caller.
- A Protobuf ("pbody")
- Data as serialized value vectors ("dbody")
The bodies are optional: any given message may have only a pbody, only a dbody, neither or both. The type of content is determined by the RPC type as will be discussed later.
When receiving messages, Drill sets up a series of Netty handlers to receive and decode messages. The key handler is the RPCBus
which handles the "modes" mentioned above. RPCBus
handle the PING
/PONG
messages itself. All others must be handed over to other parts of Drill for processing.
The RPCBus
places the incoming message parts (RPC type, pbody and dbody) into a "Request" object: either RequestEvent
or ResponseEvent
Netty is asynchronous, so that message processing must be done on other than Netty threads. The RPCBus
handles Netty messages on the Netty thread, but immediately creates "Event" objects which are handed off (via a series of indirections) to a worker thread pulled from a thread pool.
The worker thread (via configuration) hands control over to the handle()
method on one of the BasicServer
implementations:
-
UserServer
: Handles communication with a Drill client. -
DataServer
: Handle data transfers. -
ControlServer
: Handles control signals between Drillbits.
The following sessions discuss each resulting application-level protocol.
Sent over a "data tunnel". The data tunnel protocol is very basic: just connect and send batches. There is no handshake, no setup and very little flow control.
The data channel uses the protobuf definitions from BitData.proto
. The mapping is:
RPC Type Name | Number | Protobuf Message |
---|---|---|
HANDSHAKE | 0 | BitClientHandshake, BitServerHandshake |
ACK | 1 | N/A |
GOODBYE | 2 | N/A |
REQ_RECORD_BATCH | 3 | FragmentRecordBatch |
Basic protocol:
Step | Upstream | Downstream |
---|---|---|
Connect | Connect | Accept Connection |
Heartbeat | sends PING | accepts PING |
accepts PONG | sends PONG | |
Send Batch | Send Batch 1 | Receive Batch 1 |
Send Batch 2 | Receive Batch 2 | |
Send Batch 3 | Receive Batch 3 | |
Flow Control | Wait | |
Receive OK | Send OK | |
Send | ... | ... |
Key steps:
- Open the data tunnel using
DataConnectorCreator
which creates holds a connectionDataServer
, one for each destination endpoint. - Batches are sent with ...
-
DataServer
handles incoming RPC messages using thehandle()
method. - The same
DataServer
processes incoming batches via thesubmit()
and ` - REQ_RECORD_BATCH
The FragmentRecordBatch
message is the key to the protocol. It has three key parts:
- A source address (query id, major fragment id, minor fragment id)
- A destination address (query id, major fragment id, minor fragment id)
- The actual record batch
- Flag to indicate if the batch is the last batch.