Hi dear developers team and community, this is a follow-up to case 07426946, where I was asked to post my question here.
I deployed a new ATS + ADH + ADS setup in production today, and I am
continuously seeing this error message in the RRCPDL-Source logs:
-----
Fri Mar 08 10:58:34.871198 2019 /tmp/.rrcp/source.0.rrmp: WARNING: [../Wrapper/Userlevel/rrcpCW_NetMgr.c,NetMgr_sendPkt(),274] error writing to the network:
------
The ADS then logs packet losses:
-------
RRCP STATUS MSG: RRCP_BC_MISSEDMSGS: gap in broadcast msgs from node 169752968 (10.30.57.136) From Port 37000
--------
The RRCPDL parameters are:
------
[pars3pmdsc527:/rmds/users/mdadmin] rrdump source -DL -port 37000 -P
rrdump v6.7.F30: [423]:Connected with RRCP-daemonless, Version rrcp6.7.T25(6070025), Control Port: 37000
RRCP Parameter Values (IpAddr: 10.30.57.136 (169752968) source mode, device /tmp/.rrcp/source.1):
RRCP process version (6070025)
Ip Addr = 10.30.57.136 (169752968)
network = 10.30.57.136
recvAddr = 0.0.0.0 (INADDR_ANY)
MCRecvPort = 37002 (0x908a)
MCSendToPort = 37001 (0x9089)
PPPort = 37000 (0x9088)
MCSendFromPort = (0) not configured
devicePath = /tmp/.rrcp/source.0.rrmp
packetSize = 3100
maxPktPoolSize = 200000
pktPoolLimitHigh = 190000
pktPoolLimitLow = 180000
shuffleTolerance = 1024
userQLimit = 65535
tdata = 4
ndata = 7
nrreq = 3
trreq = 4
twait = 2
nmissing = 128
tbchold = 30
tpphold = 29
nackDelayTime = 20
bitmapFilter = 0
logger.level = 3
logger.file = /rmds/log/rrcpd/adh_0_source_rrcpdl.log
logger.maxSize = 52428800
logger.maxSwapfiles = 5
useIpMulticast = True
ipMultTTL = 16
network = net-feed
interface = 10.30.57.136
sendMultAddress = 239.254.0.168
recvMultAddress = 239.254.0.169
hsmInterface = 10.30.57.136
hsmMultAddress = 239.254.0.117
hsmPort = 30101
hsmInterval = 1
overflowMsgDump-oldest = 1
maxUsers = 10
recvPortLow = 0
recvPortHigh = 0
udpSendBufSize = 524288
udpRecvBufSize = 524288
nackDelayTime = 20
ackPackingRatio = 10
weightPPRetransSent = 1
weightPPRetransRcvd = 1
weightBCRRequestSent = 1
weightBCRRequestRcvd = 1
congestionHiWaterMark = 50
congestionLowWaterMark = 15
congestionEvaluationInterval = 5
sessionProps = (0x00000001)
congestionControl = enabled
switchReorderFix = disabled
multiThreadEngine = disabled
clock tick = 100
--------
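One thing that could be checked on the host, given udpSendBufSize and udpRecvBufSize = 524288 above, is whether the operating system actually grants buffers of that size. The sketch below is only an illustration in Python and is not part of RRCP: the function name granted_udp_buffers is hypothetical, and it assumes a Linux host, where the kernel caps the request at net.core.wmem_max / net.core.rmem_max and reports back double the granted value. Whether this has anything to do with the "error writing to the network" warning is not established.

```python
import socket

# Value taken from udpSendBufSize / udpRecvBufSize in the parameter dump.
REQUESTED = 524288


def granted_udp_buffers(requested=REQUESTED):
    """Request UDP send/receive buffers and return what the OS reports back.

    Hypothetical helper for illustration; it does not touch RRCP itself.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, requested)
        s.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, requested)
        snd = s.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
        rcv = s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    return snd, rcv


if __name__ == "__main__":
    snd, rcv = granted_udp_buffers()
    print(f"requested={REQUESTED} reported SO_SNDBUF={snd} SO_RCVBUF={rcv}")
```

On Linux, a reported value below twice the requested size would suggest that the kernel limit is lower than the configured RRCP buffer size.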
And some statistics:
-------
RRCP Statistics (IpAddr: 10.30.57.136 (169752968) source mode, device /tmp/.rrcp/source.1):
--- Fri Mar 8 11:12:26 2019
Total pkts sent:            14937491    Rxmt'd PP pkts sent:        2191
BC pkts sent:               14931213    Unack'd PP pkts:            283
PP pkts sent:               6278        RXMTREQPP pkts rcvd:        185
Total pkts rcvd:            920570      RXMTREQPP pkts sent:        25
BC pkts rcvd:               912267      Msgs from users:            14496073
PP pkts rcvd:               8303        BC msgs from users:         14493088
BC DATA pkts sent:          14493088    PP msgs from users:         2985
PP DATA pkts sent:          2985        Msgs to users:              433024
BC DATA pkts rcvd:          430193      DATA msgs to users:         432534
PP DATA pkts rcvd:          2341        BC DATA msgs to users:      430193
ACK pkts sent:              1077        PP DATA msgs to users:      2341
ACKs sent:                  2341        STATUS msgs to users:       490
ACK pkts rcvd:              843         Bad pkts/from user:         0
ACKs rcvd:                  2672        Bad pkts/from net:          0
PP DATA pkts ackd by wndw:  30          Discards/bad opcode:        0
RXMTREQs staged:            0           Discards/old BC:            0
RXMTREQs canceled:          0           Discards/old PP:            0
RXMTREQ pkts sent:          0           Discards/rxmt'd PP:         0
RXMTREQs sent:              0           Msgs filtered out:          0
Rxmt'd RXMTREQ pkts sent:   0           Loop Msg filtered out:      0
Rxmt'd RXMTREQs sent:       0           BC msgs misordered:         0
Rxmt'd BC pkts rcvd:        0           PP msgs misordered:         0
DISCARD pkts rcvd:          0           Lost data/BC gaps msgs:     0
RXMTREQ pkts rcvd:          1279        Lost data/BC packets:       0
RXMTREQs rcvd:              1819        Lost data/PP gaps msgs:     25
Rxmt'd RXMTREQ pkts rcvd:   3655        Lost data/PP packets:       26
Rxmt'd RXMTREQs rcvd:       5457        Lost data/node resync:      0
Rxmt'd BC pkts sent:        2137        Lost data/msg dscrd'd:      0
DISCARD pkts sent:          0           Lost data/incmplt msg:      0
NULL pkts sent:             435988      Lost data/user Q overflow:  0
NULL pkts rcvd:             482074      Pkt buffers in use:         6
Heartbeats sent:            0           Msg buffers in use:         5
Heartbeats rcvd:            0           Total bytes sent:           2156837849
Rxmt'd PP pkts rcvd:        0           Total bytes recv:           45516739
-----
We checked the server (cables, CRC errors, soft errors) and cannot see
any problem. Can you please help me understand why this is failing?
Even though there is very little traffic on this server pair, it looks
like RRCPd regularly loses a lot of packets. I am not sure whether this
is related to the fact that the ATS "listens" to the ADH hot-standby
state, which may generate traffic; I really don't know.
Any help will be appreciated!
Thanks,
Julien
(PS: I am afraid that the formatting of the statistics will not be helpful here, but it displays well in the support ticket.)