Description
Requirements
- Support multiple clients on the same channel, where "same channel" means same EndpointConfiguration. Today, clients close their channel unless explicitly requested not to. However, channels can be shared, which makes this design brittle.
- Current behavior of one client per channel should be retained.
- Clients can trigger a reconnect of a channel, which causes the other clients on that channel to run through their reconnect sequence (e.g. session reactivation). Multiple reconnect triggers should coalesce into a single reconnect.
- Clients are notified when the channel is disconnected/reconnected to manage internal state better.
- Simplify the API so that users no longer have to manage AttachChannel/DetachChannel. Instead, channels are managed centrally.
- Make reconnect handling transparent to the user. The session keep-alive (KA) workers shall reconnect directly, and channel retry policies should be configurable.
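The coalescing requirement above can be sketched as follows. This is only an illustration: `ReconnectCoalescer` and `TriggerAsync` are hypothetical names that are not part of the proposal, and the actual reconnect work is passed in as a delegate.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper (not part of the proposed API): collapses concurrent
// reconnect triggers into a single in-flight reconnect.
public sealed class ReconnectCoalescer
{
    private readonly Func<CancellationToken, Task> _reconnect;
    private readonly object _lock = new object();
    private Task _inFlight = Task.CompletedTask;

    public ReconnectCoalescer(Func<CancellationToken, Task> reconnect)
    {
        _reconnect = reconnect;
    }

    public Task TriggerAsync(CancellationToken ct = default)
    {
        lock (_lock)
        {
            // A reconnect is already running: join it instead of starting another.
            if (!_inFlight.IsCompleted)
            {
                return _inFlight;
            }

            // Simplification: the token of the first caller governs the shared run.
            _inFlight = Task.Run(() => _reconnect(ct), ct);
            return _inFlight;
        }
    }
}
```

Callers that arrive while a reconnect is running receive the same task, so a burst of keep-alive failures from several sessions results in one reconnect attempt.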
Design
This builds on #3282. We pass one additional parameter during client creation: the new channel manager interface. The channel manager will be expanded to add ref counting, and the channel interface will gain an async state change event (through an extended interface, maybe IManagedTransportChannel).
Then there are the following scenarios:
1. Connect (create) - obtain a channel from the channel manager and use it.
2. Reconnect - call reconnect on the channel. The channel manager will tear it down, reconnect, and signal any registered client to re-activate (or ignore the signal) if a new connection was required.
3. Close - release the ref count and close the channel if needed.
4. Channel disconnected - same as 2: signal to the current owner that the channel was disconnected and then reconnected; the client can re-activate if needed.
5. Recreate - recreate a client based on another "closed client".
Scenarios 2 and 4 probably have the most overlap, and 5 is basically 1. However, recreate could simply be the reconnect sequence falling back to CreateSession if ActivateSession fails on the provided channel.
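The fallback mentioned here, where recreate is just the reconnect sequence falling back to CreateSession, might look roughly like this. `SessionRecovery` and its delegate parameters are hypothetical stand-ins for the real ActivateSession and CreateSession calls, which are not shown.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: try to re-activate the existing session on the new
// channel first, and fall back to creating a new session if that fails.
public static class SessionRecovery
{
    public static async ValueTask<bool> ReactivateOrRecreateAsync(
        Func<CancellationToken, ValueTask<bool>> activateSession,
        Func<CancellationToken, ValueTask<bool>> createSession,
        CancellationToken ct)
    {
        // Scenarios 2 and 4: the session may still exist on the server.
        if (await activateSession(ct).ConfigureAwait(false))
        {
            return true;
        }

        // Scenario 5: the session is gone, so recreate it.
        return await createSession(ct).ConfigureAwait(false);
    }
}
```

A client could call such a helper from its reconnect callback and return the result to the channel manager.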
A channel identifier would be based on the EndpointConfiguration (details TBD), from which we create a strong hash. Clients obtain an IManagedTransportChannel by providing an EndpointConfiguration and a callback interface (which also supplies an identifier for logging and observability). When a client closes the channel, the channel remains open until the last client has returned it.
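A minimal sketch of the keying and ref counting described above. Which EndpointConfiguration fields go into the hash is still TBD in the proposal, so the three string inputs to `ChannelKey.Create` are assumptions, and `RefCountedChannels` is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;
using System.Text;

// Assumption: the key is a SHA-256 hash over a few endpoint fields.
public static class ChannelKey
{
    public static string Create(string endpointUrl, string securityPolicyUri, string securityMode)
    {
        string canonical = string.Join("\n", endpointUrl, securityPolicyUri, securityMode);
        return Convert.ToHexString(SHA256.HashData(Encoding.UTF8.GetBytes(canonical)));
    }
}

// Hypothetical registry: one channel per key, closed when the last user releases it.
public sealed class RefCountedChannels<TChannel> where TChannel : class, IDisposable
{
    private readonly Dictionary<string, (TChannel Channel, int RefCount)> _entries =
        new Dictionary<string, (TChannel Channel, int RefCount)>();
    private readonly object _lock = new object();

    public TChannel Acquire(string key, Func<TChannel> factory)
    {
        lock (_lock)
        {
            if (_entries.TryGetValue(key, out var entry))
            {
                _entries[key] = (entry.Channel, entry.RefCount + 1);
                return entry.Channel;
            }

            TChannel channel = factory();
            _entries[key] = (channel, 1);
            return channel;
        }
    }

    // Returns true if this was the last reference and the channel was closed.
    public bool Release(string key)
    {
        TChannel toClose;
        lock (_lock)
        {
            if (!_entries.TryGetValue(key, out var entry))
            {
                return false;
            }
            if (entry.RefCount > 1)
            {
                _entries[key] = (entry.Channel, entry.RefCount - 1);
                return false;
            }
            _entries.Remove(key);
            toClose = entry.Channel;
        }

        // Dispose outside the lock so a slow close does not block other clients.
        toClose.Dispose();
        return true;
    }
}
```

With this shape, two clients using the same endpoint settings get the same channel instance, and only the second `Release` actually closes it.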
The callback interface could reuse IClientBase, alongside a new IClientChannelManager:
public interface IClientBase
{
    // ... existing members ... (which include EndpointConfiguration)
    string Id { get; }
    ValueTask<bool> OnReconnectAsync(IManagedTransportChannel channel, int reconnectAttempt, CancellationToken ct);
}
public interface IClientChannelManager
{
    // ... changed ...
    ValueTask<IManagedTransportChannel> GetAsync(IClientBase client, CancellationToken ct);
    ValueTask ReconnectAsync(EndpointConfiguration configuration, CancellationToken ct);
    // Used for example by certificate manager/store on update
    ValueTask ReconnectAllAsync(CancellationToken ct);
}
For retry policies we want to use resilience strategies (e.g. Polly) at the channel level. For HTTP client channels we want to use Microsoft.Extensions.Http, Microsoft.Extensions.Http.Resilience and Microsoft.Extensions.Http.Diagnostics instead of creating and managing a default .NET HttpClient ourselves. The HttpClient is then obtained from IHttpClientFactory, which can be customized with appropriate resilience, logging, etc. For TCP channels we want to follow a similar pattern via an ITcpClientChannelBinding interface, so that client channels can be created with alternative resilience and logging strategies via DI. The connect/reconnect will be wrapped with resilience and "UntilCancelled" strategies, where the default will mimic the exponential retry of the current ReconnectHandler. Cancellation will happen via CancellationToken: each client (IClientBase) has a CT property that is signaled on close and stops the connectivity resilience.
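To illustrate the "UntilCancelled" behaviour with exponential backoff, here is a hand-written loop. It deliberately does not use Polly, to stay dependency-free; a real implementation would presumably express the same thing as a resilience pipeline. `ConnectRetry`, the delay parameters and the doubling factor are assumptions, not the actual ReconnectHandler settings.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for a resilience pipeline: retry the connect delegate
// with exponential backoff until it succeeds or the client's token is signaled.
public static class ConnectRetry
{
    public static async Task<T> UntilCancelledAsync<T>(
        Func<CancellationToken, Task<T>> connect,
        TimeSpan initialDelay,
        TimeSpan maxDelay,
        CancellationToken ct)
    {
        TimeSpan delay = initialDelay;
        while (true)
        {
            ct.ThrowIfCancellationRequested();
            try
            {
                return await connect(ct).ConfigureAwait(false);
            }
            catch (Exception) when (!ct.IsCancellationRequested)
            {
                // A real implementation would log the error and emit a metric here.
            }

            // Task.Delay throws OperationCanceledException when the client closes.
            await Task.Delay(delay, ct).ConfigureAwait(false);

            // Double the delay, capped at maxDelay.
            delay = TimeSpan.FromTicks(Math.Min(delay.Ticks * 2, maxDelay.Ticks));
        }
    }
}
```

The token passed in would be the per-client CT described above, so closing a client ends the retry loop instead of leaving it running in the background.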
Care must be taken to ensure clients are blocked from sending until the reconnect sequence has completed; that is, no requests should be possible while the client channel manager reconnects. The channel will block all calls to SendRequestAsync, except for Discovery and session services, until all reconnect sequences have run. All reconnect sequences will run in parallel, i.e. the clients will be notified in parallel via Task.WhenAll. If any client fails to run its reconnect sequence, the reconnect fails, a new channel is created, and the reconnect sequence is run again. This repeats until the clients indicate they want to close. The same or an alternative retry policy will apply. We will log all errors and emit metrics.
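The blocking and parallel notification described above could be sketched like this, assuming .NET 6 or later for `Task.WaitAsync`. `ReconnectGate` and `ReconnectSequence` are hypothetical names, the client callbacks are plain delegates rather than IClientBase, and the exemption for Discovery and session services is not modeled.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical gate that regular requests await before being sent.
public sealed class ReconnectGate
{
    private readonly object _lock = new object();
    private TaskCompletionSource<bool> _open = CreateCompleted();

    private static TaskCompletionSource<bool> CreateCompleted()
    {
        var tcs = new TaskCompletionSource<bool>(TaskCreationOptions.RunContinuationsAsynchronously);
        tcs.SetResult(true);
        return tcs;
    }

    public Task WaitUntilOpenAsync(CancellationToken ct)
    {
        Task current;
        lock (_lock) { current = _open.Task; }
        return current.WaitAsync(ct);
    }

    public void Close()
    {
        lock (_lock)
        {
            if (_open.Task.IsCompleted)
            {
                _open = new TaskCompletionSource<bool>(TaskCreationOptions.RunContinuationsAsynchronously);
            }
        }
    }

    public void Open()
    {
        lock (_lock) { _open.TrySetResult(true); }
    }
}

public static class ReconnectSequence
{
    // Closes the gate, notifies all clients in parallel, and reopens the gate
    // only if every client succeeded. On failure the gate stays closed so the
    // caller can create a new channel and run the sequence again.
    public static async Task<bool> RunAsync(
        ReconnectGate gate,
        IReadOnlyList<Func<CancellationToken, Task<bool>>> clientCallbacks,
        CancellationToken ct)
    {
        gate.Close();
        bool[] results = await Task.WhenAll(clientCallbacks.Select(cb => cb(ct))).ConfigureAwait(false);
        bool allSucceeded = results.All(r => r);
        if (allSucceeded)
        {
            gate.Open();
        }
        return allSucceeded;
    }
}
```

In this sketch a request path would await `WaitUntilOpenAsync` before sending, so requests issued during a reconnect are held back rather than failing on a half-initialized channel.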
ClientBase.AttachChannel/DetachChannel would go away with no replacement (obsoleted, but kept as no-ops). A reconnect can be triggered by the session keep-alive or by any user via the ClientBase API "ReconnectAsync". This will reconnect or recreate the underlying channel wrapped in IManagedTransportChannel. ReconnectHandler will also be obsoleted and become a "stub".