Skip to content

Conversation

@kimolegan
Copy link

@kimolegan kimolegan commented Nov 7, 2017

Problem description.
Keep alive problems on the client side are detected. This is happening due to m_outstandingRequests count limited as 10 + subscription count

// limit the number of keep alives sent.
                if (OutstandingRequestCount > SubscriptionCount + 10)
                {
                    return;
                }

and new keep alive requests are not sent if limit reached.
Bug repeating algorithm: session timeout interval set to 1 hour, subscription publishing interval set to 10 sec, session keepalive interval set to 1 sec. After connection was established turn off network adapter. After 15 failed reconnection attempts there should be 15 "hanged" requests in m_outstandingRequests, now turn on network adapter. Reached limit of m_outstandingRequests will prevent sending addidtional keepalive requests after session reconnecting and right after reconnection keep alive problems detected again. Reconnection starts again and so forth.

Root cause is that during reconnection BeginPublish requests are sent

try
            {                
                IAsyncResult result = BeginPublish(
                    requestHeader, 
                    acknowledgementsToSend, 
                    OnPublishComplete, 
                    new object[] { SessionId, acknowledgementsToSend, requestHeader });
                    
                AsyncRequestStarted(result, requestHeader.RequestHandle, DataTypes.PublishRequest);

                // Utils.Trace("PUBLISH #{0} SENT", requestHeader.RequestHandle);

                return result;
            }
            catch (Exception e)
            {   
                Utils.Trace(e, "Unexpected error sending publish request.");
                return null;
            }

, then immediately callback is called in another thread, where AsyncRequestCompleted is called. At this moment AsyncRequestStarted was not called yet and therefore dummy AsyncRequestState is addded to m_outstandingRequests

// add a dummy placeholder since the begin request has not completed yet.
                if (state == null)
                {
                    state = new AsyncRequestState();

                    state.Defunct = true;
                    state.RequestId  = requestId;
                    state.RequestTypeId = typeId;
                    state.Result = result;
                    state.Timestamp = DateTime.UtcNow;

                    m_outstandingRequests.AddLast(state);
                }

. Then inside BeginPublish Exception "HostNotFound" is thrown. TcpCLientChannel.cs, line 139

try
                {
                    Socket.BeginConnect(m_via, m_ConnectCallback, operation);
                }
                catch (SocketException e)
                {
                    Shutdown(StatusCodes.Bad);
                    throw e;
                }        

As result AsyncRequestStarted method never reached and dummy request added in AsyncRequestCompleted never removed from m_outstandingRequests.

To solve the problem it is proposed not to register requests as dummy if AsynRequestCompleted is called earlier than AsyncRequestStarted but instead just skip any registering. IAsyncResult.IsCompleted flag is always set True before calling completion callback (where AsyncRequestCompleted is called), so racing condition is not possible in AsyncRequestStarted. Latest mentioned fact regarding IsCompleted flag is checked in TcpAsyncOperation.cs and AsyncResultBase.cs files.

P.S. Proposed solution was tested in real Ua client application successfully.

@CLAassistant
Copy link

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants