Hi, I am trying to run rmon to understand some packet congestion we are seeing on our TREP network. All 4 infra processes are running in Docker using "host" networking. I extracted the RRCP ports from the process config files, which gave me:
-r is from *RRCP*ports: 0x4e20 0x4e21 0x4e30 0x4e31
-R is from *sink*RRCP*recvMultAddress
-W is from *source*RRCP*recvMultAddress
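The config files give the ports in hex, while the rmon command line below uses decimal. A minimal sketch of the conversion, in case it helps anyone checking my numbers (the variable names are just for illustration):

```python
# Hex RRCP port values as they appear in the process config files.
hex_ports = ["0x4e20", "0x4e21", "0x4e30", "0x4e31"]

# int(value, 16) accepts the "0x" prefix when the base is 16.
decimal_ports = [int(p, 16) for p in hex_ports]

print(decimal_ports)
```

So the first pair is 20000/20001 and the second pair is 20016/20017.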
./rmon -S -i bond0 -p 1 -R *.*.*.2 -W *.*.*.3 -k 4 -r 20000 -w 20001 2> err.log
This gives me the following output:
================================================================================
NETWORK STATISTICS
================================================================================
Thu Oct 6 13:20:56 Sampling Time: 1.01 Time Elapsed: 12.6 Pkts Dropped: 1
Bad RRCP:: Opcode: 0 Version: 0 Short frame: 0 Short hdr: 0
Bad RRCP:: Old BC: 2 Gap BC: 1977 Delta BC: 593 Old NPN: 4 Delta NPN: 2
Bad RRCP:: Disc BC: 0 Rereq BC: 0 Lost PP: 0 Lost BC: 6088
RRCP:: tdata: 0.0 tack: 0.0 c_tack: 0.0
a_upd6: 19.0 c_upd6: 18.8
asm:: dropped: 724(old) 0(full) Q:1458/10000
Duplicate Server List:
Duplicate Node List:
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Average Current Current
NETWORK 1000 Mbps Load in % Total Rate Rate Net Load
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
network pkts 1263573 100449 93069 70
pp network pkts 1107600 88050 81982 61
bcast network pkts 155973 12399 11087 8
--------------------------------------------------------------------------------
tcp pkts 1105178 87857 81849 61
--------------------------------------------------------------------------------
udp pkts 158331 12586 11210 9
pp udp pkts 2406 191 133 0
bcast udp pkts 155925 12395 11077 8
--------------------------------------------------------------------------------
rrcp pkts 156116 12410 11087 8
pp rrcp pkts 197 15 10 0
pp rrcp pkts sink 98 7 5 0
pp rrcp pkts src 99 7 5 0
bcast rrcp pkts 155919 12395 11077 8
bcast rrcp pkts sink 155915 12394 11077 0
bcast rrcp pkts src 4 0 0 0
--------------------------------------------------------------------------------
rrcp data msgs 53231 4231 3793 0
rrcp data pkts 156015 12402 11082 0
pp rrcp data pkts 99 7 5 0
bcast rrcp data pkts 155916 12394 11077 8
--------------------------------------------------------------------------------
rrcp control pkts 101 8 5 0
pp rrcp control pkts 98 7 5 0
bcast rrcp control pkts 3 0 0 0
--------------------------------------------------------------------------------
retrans rrcp data pkts 0 0 0 0
retrans pp rrcp data pkts 0 0 0 0
re pp rrcp data pkts sink 0 0 0 0
re pp rrcp data pkts src 0 0 0 0
retrans bcast rrcp data pkts 0 0 0 0
re bcast rrcp data pkts sink 0 0 0 0
re bcast rrcp data pkts src 0 0 0 0
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Average Current Current
RRMP 6 Total Rate Rate Length
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
rssl_change_req_id 154 12 6 27
rssl_identifier 600 47 48 4
rssl_sink_state 1 0 0 0
rssl_src_change_config 604 48 48 10
rssl_src_state 600 47 48 16
rssl_image_data 26 2 0 0
rssl_image_hdr 38 3 0 0
rssl_status 142 11 6 64
rssl_update 985078 78310 68584 141
rssl_close 2 0 0 0
rssl_open 79 6 4 30
rssl_sink_priority 27 2 1 21
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
RRCP NODE TABLE NodeId BCRereq Discards PPRetrans
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
10.220.65.17 snk 33202 0 0 0
10.220.65.8 snk 33102 0 0 0
10.220.65.8 src 33101 0 0 0
10.220.65.17 src 33201 0 0 0
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
RRCP NODE TABLE Pkts PktRate BCPkts PPPkts
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
10.220.65.8 snk 107493 7682 107480 13
10.220.65.17 snk 48520 3400 48435 85
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
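To compare these counters between runs, I put together a rough sketch that pulls the name/value pairs out of the "Bad RRCP::" lines and keeps only the non-zero ones. It assumes the line layout is exactly as pasted above. I do not know what each counter means, so which ones matter for congestion is still an open question. The function name `parse_bad_rrcp` is my own, not part of rmon.

```python
import re

# Three "Bad RRCP::" lines copied from the rmon output above.
rmon_lines = [
    "Bad RRCP:: Opcode: 0 Version: 0 Short frame: 0 Short hdr: 0",
    "Bad RRCP:: Old BC: 2 Gap BC: 1977 Delta BC: 593 Old NPN: 4 Delta NPN: 2",
    "Bad RRCP:: Disc BC: 0 Rereq BC: 0 Lost PP: 0 Lost BC: 6088",
]

PREFIX = "Bad RRCP::"
# A counter name (letters and spaces), a colon, then an integer value.
PAIR = re.compile(r"([A-Za-z][A-Za-z ]*?):\s*(\d+)")


def parse_bad_rrcp(lines):
    """Return a dict of counter name -> integer value (hypothetical helper)."""
    counters = {}
    for line in lines:
        if not line.startswith(PREFIX):
            continue  # ignore any other rmon output lines
        body = line[len(PREFIX):]
        for name, value in PAIR.findall(body):
            counters[name] = int(value)
    return counters


counters = parse_bad_rrcp(rmon_lines)
# Keep only the counters that are non-zero in this sample.
nonzero = {name: value for name, value in counters.items() if value != 0}
print(nonzero)
```

In this sample the non-zero counters are Old BC, Gap BC, Delta BC, Old NPN, Delta NPN and Lost BC.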
I have a few questions:
Does this output look reasonable?
However, if I run with the other two RRCP ports (20016/20017), the RRMP 6 section at the bottom disappears. Why is that?
Can you advise whether I am running rmon with the right parameters to investigate congestion on the TREP network?
Also, can you point out which stats are important with regard to congestion messages (which lead to dropped ticks)?
Do I need to run rmon in each of the 4 TREP containers to fully analyse the TREP infrastructure (i.e. rrcpd, rrcpd, ads, adh)?
How do I check which NIC TREP is using for each container?
Any other comments? Sorry the question is a bit nebulous, but I couldn't find much documentation on this.