Architecture
The central part of the Cocaine architecture is the node. Each node is a cocaine-runtime process with a host-unique configuration. Essentially, a node is a collection of services and a service locator. A degenerate case is possible, where a node is started without services: this is a gateway node, which only aggregates service information about the cluster and routes clients.
A service is an actor, an RPC-enabled piece of code that accepts a certain set of messages. Technically speaking, each service dispatches a service protocol, that is, a list of methods and their respective MessageIDs, which you can call by sending messages to the service right after a connection has been established. This protocol description can be obtained dynamically (along with the service endpoint and protocol version) by resolving a service name via the locator.
The important part here is that, in line with the actor model, the client is an actor too. So, after you've sent a message to a service to do something for you, it responds by sending messages as well. But unlike server-side services with service-specific protocols, every client dispatches the streaming service protocol, mostly for backward compatibility and ease of use.
Each connection between a client and a service is multiplexed using ChannelIDs, and both ends of a given channel dispatch some specific, possibly different, protocols. For example, the usual session between a client and a service goes as follows:
- A client connects to some service and picks any channel at random (for example, channel #1), since none of them are in use at the beginning. Initially the service side of a channel dispatches the service-specific protocol, and the client side dispatches the streaming protocol.
- The client sends a message tagged with the chosen ChannelID in order to call one of the service's methods. That indicates the start of a session.
- The service switches its side of the channel to the null protocol, so that the client cannot call another method in the same channel while the service processes the request.
- The client starts to receive the streaming protocol Chunk messages with the service response.
- In the end, the service sends a Choke message to indicate that the session has been completed and switches its side of the channel back to the service-specific protocol.
- If that was the only request, the client disconnects.
Note that some services provide streamable methods: in that case the service will switch to the streaming protocol instead of the null protocol, so that you can stream some data to the service.
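The protocol switching described above can be sketched as a small state machine. This is only an illustrative in-memory model, not Cocaine's implementation: the class `ServiceChannel`, the `Protocol` enum, the `methods` table and the tuple shape of the messages are all assumptions made for the example.

```python
from enum import Enum


class Protocol(Enum):
    SERVICE = "service-specific"
    NULL = "null"
    STREAMING = "streaming"


class ServiceChannel:
    """Service side of one multiplexed channel (hypothetical model)."""

    def __init__(self, channel_id, methods, streamable=()):
        self.channel_id = channel_id
        self.methods = methods              # assumed: method name -> handler
        self.streamable = set(streamable)   # methods that accept client streams
        self.protocol = Protocol.SERVICE

    def call(self, method, *args):
        # A method call is accepted only while the service-specific
        # protocol is dispatched on this side of the channel.
        if self.protocol is not Protocol.SERVICE:
            raise RuntimeError("channel %d is busy" % self.channel_id)
        # Streamable methods switch to the streaming protocol, others to null.
        if method in self.streamable:
            self.protocol = Protocol.STREAMING
        else:
            self.protocol = Protocol.NULL
        try:
            for piece in self.methods[method](*args):
                yield ("Chunk", self.channel_id, piece)
            yield ("Choke", self.channel_id, None)
        finally:
            # The session is over: dispatch the service protocol again.
            self.protocol = Protocol.SERVICE


channel = ServiceChannel(1, {"echo": lambda text: iter([text])})
session = channel.call("echo", "hi")
first_message = next(session)       # the session has started
state_during = channel.protocol     # null protocol while the request runs
remaining = list(session)           # drains the Choke and ends the session
state_after = channel.protocol
```

Because `call` is a generator, the channel stays in the null (or streaming) protocol for as long as the client is still consuming the response, which mirrors the rule that a second method call in the same channel is rejected until the Choke has been sent.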
When a node starts, it reads its configuration file, which has a list of services to run. This list only specifies service names and types, but not network-related properties, because the I/O layer and the RPC layer are completely separate. Moreover, the services themselves have no code to communicate over the network, only the message dispatching code.
In order to enable those services to receive and send messages over the network, the node starts a special service called the locator. Every other service is attached to the locator, which in turn wraps them in an event loop, binds them to some network endpoints and announces them in the cluster. The locator itself always runs on a well-known port.
So, a client should perform the following steps to connect to the requested service:
- Connect to the service locator on a well-known port.
- Send a Resolve message with the name of the required service using any channel.
- Receive a Chunk message with the information about the service endpoint, its protocol version and its dispatch map (a mapping of message numbers to method names).
- Receive a Choke message indicating that the request has been completed.
- Connect to the specified endpoint and work with the requested service.
Services can stack protocols. For example, the Elliptics service implements both the generic storage protocol and its own specific protocol, which means that a client requesting a storage service can be routed to the Elliptics service instance. That's fine, because stacking allows the client to work with the Elliptics instance without even knowing the service-specific protocol details: protocol messages have the same MessageIDs no matter which service implements the given protocol and whether it uses protocol stacking or not.
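One way to picture stacking is as dispatch maps that share a common prefix. The sketch below assumes that MessageIDs are assigned by numbering methods in order, with the generic protocol first; that numbering scheme, the `stack` helper and all method names are assumptions for illustration, not the actual Cocaine or Elliptics definitions.

```python
# Hypothetical method tables; real method names and MessageIDs may differ.
STORAGE_PROTOCOL = ["read", "write", "remove", "find"]
ELLIPTICS_EXTRAS = ["cache_read", "cache_write"]


def stack(*protocols):
    """Build a dispatch map by numbering the methods of each protocol in order."""
    dispatch_map = {}
    for protocol in protocols:
        for method in protocol:
            dispatch_map[len(dispatch_map)] = method
    return dispatch_map


storage_map = stack(STORAGE_PROTOCOL)
elliptics_map = stack(STORAGE_PROTOCOL, ELLIPTICS_EXTRAS)

# Every generic storage MessageID means the same thing on both services,
# so a client that knows only the storage protocol can talk to either one.
shared = {
    message_id: name
    for message_id, name in storage_map.items()
    if elliptics_map.get(message_id) == name
}
```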
Optionally, the locator can be configured to aggregate other locators' multicast announcements (or use a provided list of remote nodes) and act as a cluster entry point for clients. In other words, the aggregating locator's job is to configure a gateway by connecting to all the remote nodes and monitoring their health and service updates.
Gateways are pluggable locator modules which provide remote location functionality. For example, the simple built-in Adhoc Gateway randomly picks a remote node for each client, and the IPVS Gateway operates on a kernel IPVS load balancer to set up a local virtual service for each available service in the cluster.
Clients can use these aggregating locators to access every service in the cluster regardless of their physical location in a load-balanced fashion.
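The random selection performed by an ad hoc style gateway can be sketched as follows. This is a minimal model under stated assumptions, not the built-in module: the `AdhocGateway` class here, its `add`, `remove` and `resolve` methods, and the endpoints are hypothetical names chosen for the example.

```python
import random


class AdhocGateway:
    """Picks a random remote endpoint for each resolve request (sketch)."""

    def __init__(self, rng=None):
        self._remotes = {}                    # service name -> set of endpoints
        self._rng = rng or random.Random()

    def add(self, service, endpoint):
        # Called when a remote node announces a service.
        self._remotes.setdefault(service, set()).add(endpoint)

    def remove(self, service, endpoint):
        # Called when a remote node goes away or drops a service.
        self._remotes.get(service, set()).discard(endpoint)

    def resolve(self, service):
        endpoints = self._remotes.get(service)
        if not endpoints:
            raise LookupError("service %r is not available" % service)
        # Sort first so the choice depends only on the RNG state.
        return self._rng.choice(sorted(endpoints))


gateway = AdhocGateway(random.Random(0))
gateway.add("storage", ("10.0.0.5", 32768))
gateway.add("storage", ("10.0.0.6", 32770))
picked = gateway.resolve("storage")
```

Random choice spreads clients across nodes without any shared state, at the cost of ignoring node load; a kernel-level balancer such as IPVS can apply a scheduling policy instead.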