Expected behavior
After `pendingItem.done()` is called, any `pendingItem.buffer` still held in `connection.writeRequestsCh` should not be written to the network by `connection.internalWriteData()`/`connection.cnx.Write()`.
Actual behavior
1. When we send a message, we pass `pendingItem.buffer` to `connection.writeRequestsCh`;
2. `connection.writeRequestsCh` is a buffered channel (`writeRequestsCh: make(chan Buffer, 256)`), which means the buffer will be written to the network later, asynchronously;
3. If the `pendingItem` times out before `pendingItem.buffer` has been sent to the network, `pendingItem.buffer` will be put back into the pool and may be reallocated, but `connection.writeRequestsCh` still holds a reference to the OLD buffer;
4. Now, when we reach `connection.internalWriteData()`/`connection.cnx.Write()`, a data race happens: we intend to send the OLD `pendingItem.buffer`, but the buffer already holds the data of a NEW `pendingItem`.

Because the timeout is much longer than the time it normally takes to send, this has not happened in reality, but it is theoretically possible. (A minimal model of this lifecycle is sketched right after this list.)
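To make the lifecycle concrete, here is a minimal, self-contained sketch of the same pattern. This is not the pulsar-client-go code: `bufferPool`, `writeCh`, and the message strings are hypothetical stand-ins for `buffersPool`, `connection.writeRequestsCh`, and the batched payloads.

```go
package main

import (
	"fmt"
	"sync"
)

// Hypothetical stand-in for buffersPool.
var bufferPool = sync.Pool{
	New: func() interface{} { return make([]byte, 0, 64) },
}

func main() {
	writeCh := make(chan []byte, 1) // buffered, like writeRequestsCh

	// 1. The producer enqueues the OLD buffer for asynchronous writing.
	oldBuf := bufferPool.Get().([]byte)[:0]
	oldBuf = append(oldBuf, "message-1"...)
	writeCh <- oldBuf

	// 2. The pendingItem "times out": done() returns the buffer to the
	//    pool, although writeCh still references the same backing array.
	bufferPool.Put(oldBuf[:0])

	// 3. A NEW pendingItem gets the recycled buffer (sync.Pool does not
	//    guarantee reuse, but it commonly happens) and overwrites it.
	newBuf := bufferPool.Get().([]byte)[:0]
	newBuf = append(newBuf, "message-2"...)
	_ = newBuf

	// 4. The connection goroutine finally drains the channel and writes
	//    "message-2" where "message-1" was intended; when steps 3 and 4
	//    run concurrently, this is a data race on the buffer's memory.
	fmt.Printf("written to network: %q\n", <-writeCh)
}
```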
Steps to reproduce
Code review:
```go
// writeData will pass `pendingItem.buffer` to `connection.writeRequestsCh`
func (p *partitionProducer) writeData(buffer internal.Buffer, sequenceID uint64, callbacks []interface{}) {
	select {
	case <-p.ctx.Done():
		for _, cb := range callbacks {
			if sr, ok := cb.(*sendRequest); ok {
				sr.done(nil, ErrProducerClosed)
			}
		}
		return
	default:
		now := time.Now()
		p.pendingQueue.Put(&pendingItem{
			createdAt:    now,
			sentAt:       now,
			buffer:       buffer,
			sequenceID:   sequenceID,
			sendRequests: callbacks,
		})
		p._getConn().WriteData(buffer)
	}
}
```
```go
// `connection.writeRequestsCh` is a buffered channel
func newConnection(opts connectionOptions) *connection {
	cnx := &connection{
		// This channel is used to pass data from producers to the connection
		// go routine. It can become contended or blocking if we have multiple
		// partition produces writing on a single connection. In general it's
		// good to keep this above the number of partition producers assigned
		// to a single connection.
		writeRequestsCh: make(chan Buffer, 256),
	}
	return cnx
}
```
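Note that the channel stores the `Buffer` value, i.e. a reference to the underlying memory, not a copy of the bytes. The tiny illustration below (hypothetical names, using a plain `[]byte` instead of the client's `Buffer`) shows that mutating the memory after the send is visible to the eventual receiver:

```go
package main

import "fmt"

func main() {
	ch := make(chan []byte, 1)

	buf := []byte("old")
	ch <- buf // the channel holds a reference to buf's backing array

	copy(buf, "new") // later mutation, e.g. by a reused pooled buffer

	fmt.Println(string(<-ch)) // prints "new", not "old"
}
```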
```go
// `pendingItem.buffer` is held in `connection.writeRequestsCh` and will be sent in another goroutine
func (c *connection) WriteData(data Buffer) {
	select {
	case c.writeRequestsCh <- data:
		// Channel is not full
		return
	default:
		// Channel full, fallback to probe if connection is closed
	}

	for {
		select {
		case c.writeRequestsCh <- data:
			// Successfully wrote on the channel
			return
		case <-time.After(100 * time.Millisecond):
			// The channel is either:
			// 1. blocked, in which case we need to wait until we have space
			// 2. the connection is already closed, then we need to bail out
			c.log.Debug("Couldn't write on connection channel immediately")
			if c.getState() != connectionReady {
				c.log.Debug("Connection was already closed")
				return
			}
		}
	}
}
```
```go
// `pendingItem.buffer` will be written to network by c.internalWriteData(data)
func (c *connection) run() {
	pingSendTicker := time.NewTicker(c.keepAliveInterval)
	pingCheckTicker := time.NewTicker(c.keepAliveInterval)

	defer func() {
		// stop tickers
		pingSendTicker.Stop()
		pingCheckTicker.Stop()

		// all the accesses to the pendingReqs should be happened in this run loop thread,
		// including the final cleanup, to avoid the issue
		// https://github.com/apache/pulsar-client-go/issues/239
		c.failPendingRequests(errConnectionClosed)
		c.Close()
	}()

	// All reads come from the reader goroutine
	go c.reader.readFromConnection()
	go c.runPingCheck(pingCheckTicker)

	c.log.Debugf("Connection run starting with request capacity=%d queued=%d",
		cap(c.incomingRequestsCh), len(c.incomingRequestsCh))

	for {
		select {
		case <-c.closeCh:
			c.failLeftRequestsWhenClose()
			return

		case req := <-c.incomingRequestsCh:
			if req == nil {
				return // TODO: this never gonna be happen
			}
			c.internalSendRequest(req)

		case cmd := <-c.incomingCmdCh:
			c.internalReceivedCommand(cmd.cmd, cmd.headersAndPayload)

		case data := <-c.writeRequestsCh:
			if data == nil {
				return
			}
			c.internalWriteData(data)

		case <-pingSendTicker.C:
			c.sendPing()
		}
	}
}
```
```go
// when pendingItem is done, its buffer will be put back to the pool and realloced later
func (i *pendingItem) done(err error) {
	if i.isDone {
		return
	}
	i.isDone = true
	buffersPool.Put(i.buffer)
	if i.flushCallback != nil {
		i.flushCallback(err)
	}
}
```
So when `pendingItem.done()` is called before `c.internalWriteData(data)` gets to run, the data race described above occurs.
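One direction for a fix is a single atomic ownership flag: whichever of the timeout path and the write path claims the item first gets to use or recycle the buffer, and the loser must not touch it. The sketch below is only an illustration of that pattern under hypothetical types (`item`, `claimed`); it is not the project's actual API or fix.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// item models a pendingItem whose buffer has exactly one owner at a time.
// The claimed flag decides the race between the timeout path (done) and
// the connection writer: only the side that flips it first may use or
// recycle the buffer. Hypothetical type, not the pulsar-client-go API.
type item struct {
	claimed atomic.Bool
	buffer  []byte
}

// done runs on timeout; it may recycle the buffer only if the writer
// has not already claimed it.
func (i *item) done() {
	if i.claimed.CompareAndSwap(false, true) {
		fmt.Println("timeout path: safe to return buffer to the pool")
		// buffersPool.Put(i.buffer) would go here
	}
}

// write runs in the connection goroutine; it may send the buffer only
// if the timeout has not already claimed it.
func (i *item) write() {
	if i.claimed.CompareAndSwap(false, true) {
		fmt.Printf("write path: sending %q\n", i.buffer)
		// the writer now owns the buffer and recycles it after sending
	}
}

func main() {
	it := &item{buffer: []byte("message-1")}
	it.done()  // the timeout wins the race here ...
	it.write() // ... so the late writer must not touch the buffer
}
```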
System configuration
Pulsar version: x.y