Docs: add a design document for the Client #78

Open: wants to merge 1 commit into `master`
`docs/client-design.md` (+305, -0)

# Tntcxx Client Design

## Scope

This document describes the design of the Tntcxx Client. First we state the
requirements and use cases we have in mind, and then we propose a design that
fulfills them.

## Requirements

### Functional Requirements

We envision Tntcxx being used primarily as part of network applications (e.g.,
an HTTP server), with the Tntcxx Client backing requests to Tarantool. Such
applications are usually built around a central event processing loop, which is
responsible for multiplexing network I/O.

#### Application Event Processing Loop Integration

The first and foremost requirement is that there must be a convenient
way to integrate the Tntcxx Client into the event processing loop used by the
application. Moreover, the Tntcxx Client must never run the event processing
loop itself.

At the same time, since event processing loops are inherently single-threaded,
we do not expect the Tntcxx Client to be used in a multithreaded environment,
i.e., with connections and request futures shared across different threads.
Hence, we do not aim to make the Tntcxx Client thread-safe.

#### Asynchronous Request Processing

Since the Tntcxx Client must not run the event processing loop itself, it must
support asynchronous request processing through application-provided callbacks
or futures.
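
For illustration, a minimal sketch of the callback style from the application's
point of view, assuming the `Connection` and `Request` interfaces proposed in
the Design section below; `MyIOEventProvider` and `some_request` are
placeholders rather than part of the design:

```c++
/*
 * Hypothetical sketch: the callback fires later, once the application's own
 * event loop drives the I/O and the response arrives.
 */
void issue_async_request(Connection<MyIOEventProvider> &conn)
{
	conn.some_request(/* options */).set_callback(
		[](auto &&request, auto &connection) {
			if (request.get_error().has_value()) {
				/* Handle the connection error. */
				return;
			}
			if (auto response = request.get_response()) {
				/* Consume *response. */
			}
		});
}
```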

#### Connection State

The application must be able to check the state of a Tntcxx Client.

#### Connection Error Handling

There must be a convenient way for the application to handle errors arising
throughout the Tntcxx Client lifecycle. A connection error must be returned
through the request callback and through the request object.

#### Request Handling

In order for the application to be able to manage a request, a request object
is always returned. Through this handle, the application can check the request
status, cancel the request, handle request errors, and retrieve the response
(only once). However, if the response has already been delivered by other means
(either returned through a callback or collected through scatter-gather), the
handle cannot return the response.

#### Request Status

The application must be able to check the status of a Tntcxx Client request.

#### Request Timeout

Since the Tntcxx Client does not have control over the application's event
processing loop, the application must implement its own request timeouts. The
application can cancel stale requests.
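
For example, the application could keep its own deadline bookkeeping and sweep
it from its event loop. The sketch below assumes the `Request` interface
proposed in the Design section; the `PendingRequest` type is purely
illustrative:

```c++
#include <chrono>
#include <vector>

/* Hypothetical application-side bookkeeping for request deadlines. */
struct PendingRequest {
	Request request;
	std::chrono::steady_clock::time_point deadline;
};

/* Called periodically from the application's event loop. */
void cancel_stale_requests(std::vector<PendingRequest> &pending)
{
	auto now = std::chrono::steady_clock::now();
	for (auto &p : pending) {
		if (p.request.get_status() == REQUEST_IN_PROGRESS &&
		    now >= p.deadline)
			p.request.cancel();
	}
}
```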

#### Request Cancelling

The application must be able to cancel a Tntcxx Client request. Cancelling a
request explicitly ends the lifetime of the corresponding response.

#### Request Retrying

Since the Tntcxx Client does not have control over the application's event
processing loop, the application must implement its own request retrying.

#### Request Fan-Out

A common Tntcxx Client use case is fanning a request out to multiple Tarantool
instances, collecting the responses that have arrived by some deadline, and
discarding the requests that are not ready by that deadline.
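
A hedged sketch of this pattern in terms of the plain `Request` interface
proposed below (assuming requests and responses are movable; the `FanOut`
helper in the Design section encapsulates the same idea):

```c++
#include <vector>

/*
 * Hypothetical sketch: once the application decides the deadline has passed,
 * collect the responses that are ready and cancel everything else.
 */
std::vector<Response> collect_ready(std::vector<Request> &requests)
{
	std::vector<Response> ready;
	for (auto &request : requests) {
		if (request.get_status() == REQUEST_SUCCESS) {
			if (auto response = request.get_response())
				ready.push_back(std::move(*response));
		} else {
			request.cancel();
		}
	}
	return ready;
}
```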

#### Response Lifetime

The response lifetime is managed implicitly through the lifetime of the request.
The response is not copyable. It can be retrieved only once, and the response
ownership is moved to the application.
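
In other words (a sketch assuming the `Request` interface proposed below and a
response that has already been received):

```c++
#include <cassert>
#include <optional>

/* Hypothetical sketch of the retrieve-once, move-out semantics. */
void consume_response(Request &request)
{
	/* The first retrieval moves the response out to the application... */
	std::optional<Response> response = request.get_response();
	/* ...so any further retrieval yields an empty optional. */
	std::optional<Response> again = request.get_response();
	assert(!again.has_value());
	(void)response;
}
```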

#### Reconnection

The Tntcxx Client must support implicit reconnection with the same session
settings it was created with.

#### Connection Pool

TBD.

#### Transactions

TBD.

#### Failover

TBD.

## Design

### I/O Event Providers

```c++
/** Callback called on a read event. */
using read_ready_cb_f = void (*)(int fd, Data *data);
/** Callback called on a write event. */
using write_ready_cb_f = void (*)(int fd, Data *data);

/**
 * An I/O event provider encapsulates the notification about events for a
 * collection of file descriptors.
 *
 * `Data` is an opaque context type passed to the notification callbacks.
 */
template<class Data>
class IOEventProvider {
public:
	/**
	 * Register a file descriptor. Returns 0 on success, -1 on failure.
	 */
	int register(int fd, Data *data, read_ready_cb_f read_ready_cb,
		     write_ready_cb_f write_ready_cb);

	/**
	 * Unregister a file descriptor. Returns 0 on success, -1 on failure.
	 */
	int unregister(int fd);
};
```
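
Purely as an illustration of this contract, one way an application-side
provider could be backed by `poll(2)` is sketched below; the class name, the
`add_fd`/`remove_fd`/`run_once` spellings and the always-poll-for-both-events
behavior are simplifications, not part of the proposed design:

```c++
#include <poll.h>

#include <unordered_map>
#include <vector>

/* Illustrative provider that dispatches callbacks from poll(2). */
template<class Data>
class PollIOEventProvider {
public:
	using read_ready_cb_f = void (*)(int fd, Data *data);
	using write_ready_cb_f = void (*)(int fd, Data *data);

	/** Register a file descriptor. Returns 0 on success, -1 on failure. */
	int add_fd(int fd, Data *data, read_ready_cb_f read_ready_cb,
		   write_ready_cb_f write_ready_cb)
	{
		fds_[fd] = Entry{data, read_ready_cb, write_ready_cb};
		return 0;
	}

	/** Unregister a file descriptor. Returns 0 on success, -1 on failure. */
	int remove_fd(int fd)
	{
		return fds_.erase(fd) == 1 ? 0 : -1;
	}

	/** Poll all registered descriptors once and dispatch the callbacks. */
	int run_once(int timeout_ms)
	{
		std::vector<struct pollfd> pfds;
		pfds.reserve(fds_.size());
		for (const auto &entry : fds_) {
			struct pollfd pfd;
			pfd.fd = entry.first;
			pfd.events = POLLIN | POLLOUT;
			pfd.revents = 0;
			pfds.push_back(pfd);
		}
		int rc = poll(pfds.data(), pfds.size(), timeout_ms);
		if (rc < 0)
			return -1;
		for (const auto &pfd : pfds) {
			auto it = fds_.find(pfd.fd);
			if (it == fds_.end())
				continue;
			if (pfd.revents & POLLIN)
				it->second.read_ready_cb(pfd.fd, it->second.data);
			if (pfd.revents & POLLOUT)
				it->second.write_ready_cb(pfd.fd, it->second.data);
		}
		return rc;
	}

private:
	struct Entry {
		Data *data;
		read_ready_cb_f read_ready_cb;
		write_ready_cb_f write_ready_cb;
	};
	std::unordered_map<int, Entry> fds_;
};
```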

#### Epoll

```c++
/**
 * Since `epoll` does not have a facility for storing callbacks, we delegate
 * calling the notification callbacks to the application.
 *
 * In order to distinguish between application file descriptors and tntcxx
 * sockets, we provide a wrapper class around the `Data`, which should be
 * passed as the `ptr` argument of `epoll_data` to `epoll` by the application
 * for its own file descriptors.
 */
template<class Data>
class EpollIOEventProviderData {
public:
	EpollIOEventProviderData(int type, Data *data);

	/**
	 * Type of file descriptor, needed for distinguishing application file
	 * descriptors from tntcxx sockets.
	 */
	int type;
	Data *data;
};

/**
 * Encapsulates calling of notification callbacks from epoll for tntcxx.
 */
template<class Data>
class EpollIOEventProviderDataTntcxx : public EpollIOEventProviderData<Data> {
public:
	EpollIOEventProviderDataTntcxx(int fd, Data *data,
				       read_ready_cb_f read_ready_cb,
				       write_ready_cb_f write_ready_cb);

	/** Needs to be called by the application on a read event. */
	void read_ready();
	/** Needs to be called by the application on a write event. */
	void write_ready();
};
```
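
An illustrative sketch of the application side of this contract: the
application owns the epoll loop and registers every descriptor with a wrapper
in `epoll_data.ptr`. The `EpollFdType` constants are hypothetical; only the
dispatch pattern matters:

```c++
#include <sys/epoll.h>

/* Hypothetical tags stored in EpollIOEventProviderData::type. */
enum EpollFdType { FD_TYPE_APPLICATION = 0, FD_TYPE_TNTCXX = 1 };

template<class Data>
void dispatch_epoll_events_once(int epoll_fd)
{
	struct epoll_event events[128];
	int n = epoll_wait(epoll_fd, events, 128, /*timeout=*/-1);
	for (int i = 0; i < n; ++i) {
		auto *wrapper = static_cast<EpollIOEventProviderData<Data> *>(
			events[i].data.ptr);
		if (wrapper->type == FD_TYPE_TNTCXX) {
			auto *tnt = static_cast<
				EpollIOEventProviderDataTntcxx<Data> *>(wrapper);
			if (events[i].events & EPOLLIN)
				tnt->read_ready();
			if (events[i].events & EPOLLOUT)
				tnt->write_ready();
		} else {
			/* Handle the application's own descriptor here. */
		}
	}
}
```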

### Connections

```c++
/**
* A connection encapsulates sending requests and receiving responses from one
* Tarantool instance.
*/
template<class IOEventProvider>
class Connection {
```

@drewdzzz (Collaborator, Jan 24, 2024):

Will we have different classes for a plain connection and an SSL connection?

@CuriousGeorgiy (Member, Author, Jan 29, 2024):

Sorry, forgot about this detail. I guess we could either add a `StreamProvider` template parameter or we could use dynamic polymorphism. In the first case the user won't be able to store different connections in one container. Not sure what's best here. @alyapunov

Collaborator:

I think we can do both if we add a `StreamProvider`. In general overview:

1. `UnixPlainStream` and `UnixSSLStream` for users who don't need both SSL and plain streams at once.
2. A new `UnixDynamicStream` for the others: there is already a `transport` field in `ConnectOpts`, and we will just check it in `connect`, `send` and `recv`.

Thus, only users who need both connection types pay the performance price.

```c++
public:
	Connection(IOEventProvider &net_provider,
		   const ConnectOptions &connection_options);

	/**
	 * Return the connection's state.
	 */
	enum ConnectionState get_state() const;

	/**
	 * If the connection is in an erroneous state (see `get_state`), return
	 * the connection error that caused it. The same error is also passed
	 * to response callbacks, and the same error is returned by
	 * `Request::get_error`.
	 */
	std::optional<ConnectionError> &get_error() const &;

	/* An abstract request interface. A request object is always returned. */
	Request some_request(/* options */);
};
```
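
To make the intended flow concrete, a hedged sketch that checks the connection
state and keeps the returned request handle for later polling; it assumes
`Request` is movable, and `MyIOEventProvider` and `some_request` are
placeholders:

```c++
#include <optional>

/* Hypothetical usage sketch; error handling is abbreviated. */
std::optional<Request> issue_request(Connection<MyIOEventProvider> &conn)
{
	if (conn.get_state() == CONNECTION_ERROR) {
		/* Inspect conn.get_error() and give up. */
		return std::nullopt;
	}
	/* Keep the handle and poll its status from the application's loop. */
	return conn.some_request(/* options */);
}
```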

#### Connection State

```c++
enum ConnectionState {
	CONNECTION_INITIAL = 0,
	CONNECTION_AUTH = 1,
	CONNECTION_ACTIVE = 2,
	CONNECTION_ERROR = 3,
	CONNECTION_ERROR_RECONNECT = 4,
};
```

#### Connect Options

```c++
/* Extend `ConnectOptions` with an option for the reconnection feature. */
struct ConnectOptions {
	/* All existing options. */

	/**
	 * In the event of a broken connection, the interval at which the
	 * stream tries to re-establish the connection.
	 */
	static constexpr size_t DEFAULT_RECONNECTION_INTERVAL = 2;
	size_t reconnection_interval = DEFAULT_RECONNECTION_INTERVAL;
};
```
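
A hedged usage sketch (assuming `Connection` is movable and `MyIOEventProvider`
is the application's event provider type):

```c++
/* Hypothetical sketch: enable implicit reconnection with a custom interval. */
Connection<MyIOEventProvider>
connect_with_reconnect(MyIOEventProvider &provider)
{
	ConnectOptions opts;
	/* ... fill in the existing options (address, credentials, ...). */
	opts.reconnection_interval = 5;
	return Connection<MyIOEventProvider>(provider, opts);
}
```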

### Requests

```c++
/** Encapsulates management of a request issued through a connection. */
class Request {
public:
	/**
	 * Callback called when the response is ready. Since the callback has
	 * a fixed signature, we need to allow for capturing additional context
	 * using lambdas. Hence, we use `std::function` for type erasure.
	 *
	 * See `Connection::get_error` for details about error reporting.
	 */
	using request_cb_f =
		std::function<void(Request &&request, Connection &connection)>;

	/** Set a callback called when the response is ready. */
	void set_callback(request_cb_f request_cb) &&;

	/** Return the request's status. */
	enum RequestStatus get_status();

	/**
	 * Return a connection error, if any. See `Connection::get_error` for
	 * details.
	 */
	std::optional<ConnectionError> &get_error() const &;

	/**
	 * Cancel the request, ending the lifetime of the corresponding
	 * response.
	 */
	void cancel();

	/**
	 * Return the response, if any. A response is available iff:
	 * 1. The response has been received by the connection.
	 * 2. The response has not been dispatched to a callback.
	 * 3. The response has not already been retrieved previously.
	 */
	std::optional<Response> get_response();
};
```

Collaborator:

We could create an extract-once class:

1. Add an `isReady` method.
2. `set_callback` should be an rvalue method.
3. `get_response` -> `get_result` (the result is actually a `variant<Error, Response>`), an rvalue method.
4. `reset_callback` will be dropped. Anyway, I don't see any use cases for it.

Then the user cannot get the response after a callback is set, and vice versa.

@CuriousGeorgiy (Member, Author):

> Add an `isReady` method.

I don't understand why we need this method if we have `get_status`, which we can check for request readiness.

> `set_callback` should be an rvalue method.

This sounds like a cool solution. I suggest we keep the request object and pass it to the callback later on.

> `get_response` -> `get_result` (the result is actually a `variant<Error, Response>`), an rvalue method.

I don't like this, since it would be nice to share the `ConnectionError` object between the connection and the requests (a variant does not allow storing references).

> `reset_callback` will be dropped.

Initially, I thought there might be some conditions under which an application would want to remove the callback. But now I come to think that it is indeed redundant.

@drewdzzz (Collaborator, Jan 24, 2024):

You could store a pointer to an error in the variant instead of a reference - it is private anyway.

Yeah, `isReady` is not needed.

Collaborator:

Now you cannot do anything to the Request when a callback is set, but you can access it when the response is taken.

I won't resist the current design, but we could do the same for `get_response` (a `Result` is needed for this purpose) to make `Request` completely safe. I'd leave it up to @alyapunov.

#### Request Status

```c++
enum RequestStatus {
	REQUEST_SUCCESS = 0,
	REQUEST_ERROR = -1,
	REQUEST_IN_PROGRESS = 1,
};
```

#### Request Fan-Out

```c++
/**
 * A fan-out encapsulates the collection of responses for a collection of
 * requests.
 */
class FanOut {
public:
	template<class InputIt>
	FanOut(InputIt first, InputIt last);

	/**
	 * Return a list of ready requests, and cancel the requests for which
	 * a response is not available.
	 */
	tnt::List<Request> collect();
};
```

Collaborator:

Is this a solution to myTarget's problem? If so:

1. You mustn't cancel requests here.
2. The passed requests must be dynamic: you poll for ready requests, then you create a bunch of new requests and poll them along with the previous ones.
3. From the previous point: you must allow populating the list of requests.
4. We need to come up with a better name: it is a tool polling for ready requests, not a "FanOut" of requests.

P.S. The idea was to create something like epoll - you ask for ready requests and get only them (no need to scan all requests to find the ready ones), but all not-yet-ready requests must still be available.

@CuriousGeorgiy (Member, Author, Jan 29, 2024):

Perhaps I still don't understand the problem statement. AFAIC, there is no dynamic scenario; we just want to create a bunch of synchronization points (i.e., collect points).

In the scenario you propose, the mapping between the business logic and the responses is not clear. Let's say I have requests about a VK and a Mail.ru profile mixed up (which correspond to 2 different Tarantool instances and 2 different connections). After collecting the mixed-up responses, how am I going to understand what the sources of the requests were?

Collaborator:

I'll explain my understanding from the beginning.

We have myTarget's problem - collect as much information as we can in a limited amount of time, for example, 1 second. We have only an anonymous user id, so first we send a bunch of requests to Tarantool to get all actions with this id (email authorization, visiting a book page on litres, and so on). When we have received data for the mail.ru email, we can send a new bunch of requests to the Tarantool storing mail.ru data (name, surname, age, etc.). When we have received data from litres with the user's book ids, we can send another bunch to the Tarantool storing litres data (we want to find a book's genre by its id). After 1 second all the requests are canceled, and myTarget knows that the guy with session_id = 143 is actually Ivan Ivanov, who is keen on Stephen King books.

Firstly, I just can't implement such a service with this FanOut. For example, I call collect after 0.1s, and no responses are ready - the job has died even before it started. So, we must not cancel requests that are not ready.

Secondly, I have many different request types (search by anonymous id, mail.ru, litres and so on). I see only two solutions:

1. We will have a lot of FanOuts - one per request type for every job (collecting info by id): N*M combinators.
2. We will have something like epoll - the combinator will associate a value with every request. It can be a `void *` or a template parameter of the combinator. Then every job will wake up every 0.05s, for example, and call a single collect method for all request types. Here we will use only N combinators.

I would prefer the second variant.

Thirdly, let's imagine that we have access to the Tarantool storing Delivery Club order history. Now myTarget knows my user id in Delivery Club and wants to fetch all my orders. By user id it fetches a lot of responses with order ids. Then, by order id, it fetches information about food to find out my food preferences. If we decide to have a FanOut per request type, we will need an opportunity to append new requests resolving order_id into its contents - the collection of requests of this type is filled dynamically.
Collaborator:

About `any`, `all`, `waitFor`.

I don't see how we can implement them without access to the event loop.

When the IOEventProvider fires the connector callback, it reads the data from the socket and then places it somewhere (creating a Response). At this moment, we could fire the `any` combinator associated with the request, or bump and check the counter of the `all` combinator.

As for `waitFor`, we could just check the deadlines of all combinators on every callback from the IOEventProvider.

Our combinators will be faster than user-created ones, since we can manipulate intrusive lists of Requests under the hood, whereas the user would have to use callbacks.

@CuriousGeorgiy (Member, Author, Jan 29, 2024):

> As for `waitFor`, we could just check the deadlines of all combinators on every callback from the IOEventProvider.

I believe we discussed that we don't want to implement timers, because the user has more precise time information than we do.

> At this moment, we could fire the `any` combinator associated with the request, or bump and check the counter of the `all` combinator.

What interface do you propose for these combinators? It seems like this would create a lot of additional overhead (i.e., the connection objects would have to manage the combinator objects).

Collaborator:

I don't see much overhead. There will be a snippet somewhere in the connector:

```c++
if (req.hasCallback())
    req.cb();
else
    req.setResponse();
```

With a combinator, it will turn into a switch:

```c++
switch (req.ret_type) {
    case COMBINATOR:
        /** Can be one function - separated for clarity. */
        combinators[req.combinator_id]->list.insert(req);
        combinators[req.combinator_id]->checkReady();
        break;
    case CALLBACK:
        req.cb();
        break;
    case RAW:
        req.setResponse();
        break;
}
```

A combinator is just a list of ready requests with methods `setCallback`, `isReady`, `getResponse` - just like a usual request, but the Response can be an array of responses (for the `all` combinator, for example).

There will be no overhead for users who don't use combinators.

As for `waitFor`, let's drop it - it can be implemented later if it is needed; we would have to check the time on every response.

Collaborator:

As for me, we could design efficient combinators while we are here (they are efficient because they use intrusive lists instead of callbacks). But it's just my opinion, so I can leave this decision to @alyapunov.