Packet losses with RRCPDL under TREP 3.3.0 (Linux RHEL 7) and local ATS setup

Hi, dear developer team and community. This follows case 07426946, where I was asked to post my question here.

I deployed a new ATS + ADH + ADS setup in production today, and I am
continuously seeing this error message in the RRCPDL-Source logs:

-----
Fri Mar 08 10:58:34.871198 2019 /tmp/.rrcp/source.0.rrmp: WARNING:
[../Wrapper/Userlevel/rrcpCW_NetMgr.c,NetMgr_sendPkt(),274]
error writing to the network:
------

The ADS then logs packet losses:

-------
RRCP STATUS MSG: RRCP_BC_MISSEDMSGS: gap in broadcast msgs from node
169752968 (10.30.57.136) From Port 37000
--------
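
As a side note on reading the message above: the node number appears to be simply the source IPv4 address written as a 32-bit integer, which is how 169752968 and 10.30.57.136 pair up. A minimal Python sketch of that conversion follows; the helper names `node_id_to_ip` and `ip_to_node_id` are made up for illustration and are not part of any RRCP tool.

```python
import ipaddress


def node_id_to_ip(node_id: int) -> str:
    """Interpret a node number as a big-endian 32-bit IPv4 address."""
    return str(ipaddress.IPv4Address(node_id))


def ip_to_node_id(ip: str) -> int:
    """Convert a dotted-quad IPv4 address back to its integer form."""
    return int(ipaddress.IPv4Address(ip))


if __name__ == "__main__":
    # Node number taken from the RRCP_BC_MISSEDMSGS line quoted above.
    print(node_id_to_ip(169752968))
    print(ip_to_node_id("10.30.57.136"))
```

This makes it easier to match gap messages in the ADS log against the hosts that sent the broadcasts.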

The RRCPDL parameters are:

------
[pars3pmdsc527:/rmds/users/mdadmin] rrdump source -DL -port 37000 -P

rrdump v6.7.F30: [423]:Connected with RRCP-daemonless, Version
rrcp6.7.T25(6070025), Control Port: 37000

RRCP Parameter Values (IpAddr: 10.30.57.136 (169752968) source mode,
device /tmp/.rrcp/source.1):

RRCP process version (6070025)

Ip Addr = 10.30.57.136 (169752968)
network = 10.30.57.136
recvAddr = 0.0.0.0 (INADDR_ANY)
MCRecvPort = 37002 (0x908a)
MCSendToPort = 37001 (0x9089)
PPPort = 37000 (0x9088)
MCSendFromPort = (0) not configured
devicePath = /tmp/.rrcp/source.0.rrmp
packetSize = 3100
maxPktPoolSize = 200000
pktPoolLimitHigh = 190000
pktPoolLimitLow = 180000
shuffleTolerance = 1024
userQLimit = 65535
tdata = 4
ndata = 7
nrreq = 3
trreq = 4
twait = 2
nmissing = 128
tbchold = 30
tpphold = 29
nackDelayTime = 20
bitmapFilter = 0
logger.level = 3
logger.file = /rmds/log/rrcpd/adh_0_source_rrcpdl.log
logger.maxSize = 52428800
logger.maxSwapfiles = 5

useIpMulticast = True
ipMultTTL = 16
network = net-feed
interface = 10.30.57.136
sendMultAddress = 239.254.0.168
recvMultAddress = 239.254.0.169
hsmInterface = 10.30.57.136
hsmMultAddress = 239.254.0.117
hsmPort = 30101
hsmInterval = 1
overflowMsgDump-oldest = 1
maxUsers = 10
recvPortLow = 0
recvPortHigh = 0
udpSendBufSize = 524288
udpRecvBufSize = 524288
nackDelayTime = 20
ackPackingRatio = 10
weightPPRetransSent = 1
weightPPRetransRcvd = 1
weightBCRRequestSent = 1
weightBCRRequestRcvd = 1
congestionHiWaterMark = 50
congestionLowWaterMark = 15
congestionEvaluationInterval = 5
sessionProps = (0x00000001)
congestionControl = enabled
switchReorderFix = disabled
multiThreadEngine = disabled
clock tick = 100
--------
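
One thing that may be worth ruling out on the Linux side is whether the kernel actually grants the 524288-byte UDP buffers configured above. On Linux, a buffer size requested with a plain `SO_SNDBUF` or `SO_RCVBUF` call is capped at `net.core.wmem_max` or `net.core.rmem_max` respectively. The sketch below only compares the configured values with those kernel limits. It assumes RRCP requests its buffers through the ordinary socket options, which is not confirmed here, and it does not claim that this is the cause of the write error. The function names are illustrative.

```python
from pathlib import Path

# Values copied from the rrdump parameter listing above.
REQUESTED = {"udpSendBufSize": 524288, "udpRecvBufSize": 524288}

# Standard Linux procfs locations of the per-socket buffer ceilings.
KERNEL_LIMITS = {
    "udpSendBufSize": "/proc/sys/net/core/wmem_max",
    "udpRecvBufSize": "/proc/sys/net/core/rmem_max",
}


def read_limit(path):
    """Return the integer stored in a procfs file, or None if unreadable."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


def check(requested, limit):
    """Classify a requested buffer size against the kernel ceiling."""
    if limit is None:
        return "unknown"
    return "ok" if requested <= limit else "capped"


for name, size in REQUESTED.items():
    limit = read_limit(KERNEL_LIMITS[name])
    print(f"{name}: requested={size} kernel_max={limit} -> {check(size, limit)}")
```

A result of "capped" would only mean the effective buffer is smaller than the configured one; it would still need to be correlated with the timing of the errors.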

And some statistics:

-------
RRCP Statistics (IpAddr: 10.30.57.136 (169752968) source mode, device /tmp/.rrcp/source.1):

--- Fri Mar 8 11:12:26 2019

Total pkts sent:             14937491   Rxmt'd PP pkts sent:              2191
BC pkts sent:                14931213   Unack'd PP pkts:                   283
PP pkts sent:                    6278   RXMTREQPP pkts rcvd:               185
Total pkts rcvd:               920570   RXMTREQPP pkts sent:                25
BC pkts rcvd:                  912267   Msgs from users:              14496073
PP pkts rcvd:                    8303   BC msgs from users:           14493088
BC DATA pkts sent:           14493088   PP msgs from users:               2985
PP DATA pkts sent:               2985   Msgs to users:                  433024
BC DATA pkts rcvd:             430193   DATA msgs to users:             432534
PP DATA pkts rcvd:               2341   BC DATA msgs to users:          430193
ACK pkts sent:                   1077   PP DATA msgs to users:            2341
ACKs sent:                       2341   STATUS msgs to users:              490
ACK pkts rcvd:                    843   Bad pkts/from user:                  0
ACKs rcvd:                       2672   Bad pkts/from net:                   0
PP DATA pkts ackd by wndw:         30   Discards/bad opcode:                 0
RXMTREQs staged:                    0   Discards/old BC:                     0
RXMTREQs canceled:                  0   Discards/old PP:                     0
RXMTREQ pkts sent:                  0   Discards/rxmt'd PP:                  0
RXMTREQs sent:                      0   Msgs filtered out:                   0
Rxmt'd RXMTREQ pkts sent:           0   Loop Msg filtered out:               0
Rxmt'd RXMTREQs sent:               0   BC msgs misordered:                  0
Rxmt'd BC pkts rcvd:                0   PP msgs misordered:                  0
DISCARD pkts rcvd:                  0   Lost data/BC gaps msgs:              0
RXMTREQ pkts rcvd:               1279   Lost data/BC packets:                0
RXMTREQs rcvd:                   1819   Lost data/PP gaps msgs:             25
Rxmt'd RXMTREQ pkts rcvd:        3655   Lost data/PP packets:               26
Rxmt'd RXMTREQs rcvd:            5457   Lost data/node resync:               0
Rxmt'd BC pkts sent:             2137   Lost data/msg dscrd'd:               0
DISCARD pkts sent:                  0   Lost data/incmplt msg:               0
NULL pkts sent:                435988   Lost data/user Q overflow:           0
NULL pkts rcvd:                482074   Pkt buffers in use:                  6
Heartbeats sent:                    0   Msg buffers in use:                  5
Heartbeats rcvd:                    0   Total bytes sent:           2156837849
Rxmt'd PP pkts rcvd:                0   Total bytes recv:             45516739

-----
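
To put the counters above in proportion, the retransmitted packets can be compared with the totals sent. The short Python sketch below does that arithmetic with values copied from the dump. Reading "Rxmt'd ... pkts sent" divided by "... pkts sent" as a retransmission rate is a rough interpretation made here, not a definition taken from the RRCP documentation.

```python
def pct(part: int, whole: int) -> float:
    """Return part as a percentage of whole (0.0 if whole is zero)."""
    return 100.0 * part / whole if whole else 0.0


# Counters copied from the rrdump statistics above.
stats = {
    "BC pkts sent": 14931213,
    "Rxmt'd BC pkts sent": 2137,
    "PP pkts sent": 6278,
    "Rxmt'd PP pkts sent": 2191,
}

bc_rate = pct(stats["Rxmt'd BC pkts sent"], stats["BC pkts sent"])
pp_rate = pct(stats["Rxmt'd PP pkts sent"], stats["PP pkts sent"])

print(f"Broadcast retransmissions:      {bc_rate:.3f}% of BC pkts sent")
print(f"Point-to-point retransmissions: {pp_rate:.1f}% of PP pkts sent")
```

On these numbers the broadcast retransmissions are a very small share of the broadcast traffic, while roughly a third of the point-to-point packets were retransmitted, so the point-to-point path may deserve a closer look.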

We checked the server (cables, CRC errors, soft errors) and found nothing
wrong. Can you please help me understand why this is failing?

Even though there is very little traffic on this server pair, RRCPd
appears to lose a lot of packets, and it does so regularly. I am not sure
whether this is connected to the fact that the ATS "listens" to the ADH
hot-standby state and may generate traffic of its own; I really don't know.

Any help will be appreciated!
Thanks,
Julien

(PS: I am afraid the formatting of the statistics may not be very readable here, but it displays well in the support ticket.)

Best Answer

  • zoya faberov

    Hello @julien.dominici,

    Having read and re-read your question, I believe it is not related to any of the Refinitiv APIs.

    My understanding is that, having just updated your deployed TREP infrastructure with the latest components, you are concerned about the packet loss you are observing.

    This forum is dedicated to questions, and general discussion, of Refinitiv APIs.

    The moderators of this forum are API experts, and are not equipped to suggest, guide, or contribute meaningfully toward the resolution of the issue that you are facing.

    Instead, we suggest that you contact your Refinitiv account team to discuss the available options and find the best path toward a resolution.

    To give them a heads-up, I will also copy them on this question.

    -AHS