Hi, I am trying to run rmon to understand some packet congestion we see on our TREP network. All 4 infra processes are running in Docker using "host" networking. I extracted the RRCP ports from the process config files, which gave me:
-r (and -w) are from *RRCP*ports: 0x4e20 0x4e21 0x4e30 0x4e31
-R is from *sink*RRCP*recvMultAddress
-W is from *source*RRCP*recvMultAddress
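In case it helps anyone checking my numbers, here is a minimal sketch of how the hex port values from the config map to the decimal values that rmon takes. The variable names are just for illustration and are not part of any TREP tooling. The first two values are what I pass as -r and -w in the command below; the last two are the other port pair I mention further down.

```python
# Hex port values as listed under *RRCP*ports in the process config files.
hex_ports = ["0x4e20", "0x4e21", "0x4e30", "0x4e31"]

# int(text, 16) parses a hex string and accepts the optional "0x" prefix.
decimal_ports = [int(p, 16) for p in hex_ports]

for hex_value, dec_value in zip(hex_ports, decimal_ports):
    print(f"{hex_value} -> {dec_value}")
```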
./rmon -S -i bond0 -p 1 -R *.*.*.2 -W *.*.*.3 -k 4 -r 20000 -w 20001 2> err.log
This gives me the following output:
================================================================================
NETWORK STATISTICS
================================================================================
Thu Oct 6 13:20:56   Sampling Time: 1.01   Time Elapsed: 12.6   Pkts Dropped: 1
Bad RRCP:: Opcode: 0   Version: 0   Short frame: 0   Short hdr: 0
Bad RRCP:: Old BC: 2   Gap BC: 1977   Delta BC: 593   Old NPN: 4   Delta NPN: 2
Bad RRCP:: Disc BC: 0   Rereq BC: 0   Lost PP: 0   Lost BC: 6088
RRCP:: tdata: 0.0   tack: 0.0   c_tack: 0.0   a_upd6: 19.0   c_upd6: 18.8
asm:: dropped: 724(old) 0(full) Q:1458/10000
Duplicate Server List:
Duplicate Node List:
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
                                            Average   Current   Current
NETWORK 1000 Mbps Load in %        Total    Rate      Rate      Net Load
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
network pkts                       1263573  100449    93069     70
pp network pkts                    1107600  88050     81982     61
bcast network pkts                 155973   12399     11087     8
--------------------------------------------------------------------------------
tcp pkts                           1105178  87857     81849     61
--------------------------------------------------------------------------------
udp pkts                           158331   12586     11210     9
pp udp pkts                        2406     191       133       0
bcast udp pkts                     155925   12395     11077     8
--------------------------------------------------------------------------------
rrcp pkts                          156116   12410     11087     8
pp rrcp pkts                       197      15        10        0
pp rrcp pkts sink                  98       7         5         0
pp rrcp pkts src                   99       7         5         0
bcast rrcp pkts                    155919   12395     11077     8
bcast rrcp pkts sink               155915   12394     11077     0
bcast rrcp pkts src                4        0         0         0
--------------------------------------------------------------------------------
rrcp data msgs                     53231    4231      3793      0
rrcp data pkts                     156015   12402     11082     0
pp rrcp data pkts                  99       7         5         0
bcast rrcp data pkts               155916   12394     11077     8
--------------------------------------------------------------------------------
rrcp control pkts                  101      8         5         0
pp rrcp control pkts               98       7         5         0
bcast rrcp control pkts            3        0         0         0
--------------------------------------------------------------------------------
retrans rrcp data pkts             0        0         0         0
retrans pp rrcp data pkts          0        0         0         0
re pp rrcp data pkts sink          0        0         0         0
re pp rrcp data pkts src           0        0         0         0
retrans bcast rrcp data pkts       0        0         0         0
re bcast rrcp data pkts sink       0        0         0         0
re bcast rrcp data pkts src        0        0         0         0
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
                                            Average   Current   Current
RRMP 6                             Total    Rate      Rate      Length
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
rssl_change_req_id                 154      12        6         27
rssl_identifier                    600      47        48        4
rssl_sink_state                    1        0         0         0
rssl_src_change_config             604      48        48        10
rssl_src_state                     600      47        48        16
rssl_image_data                    26       2         0         0
rssl_image_hdr                     38       3         0         0
rssl_status                        142      11        6         64
rssl_update                        985078   78310     68584     141
rssl_close                         2        0         0         0
rssl_open                          79       6         4         30
rssl_sink_priority                 27       2         1         21
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
RRCP NODE TABLE          NodeId   BCRereq   Discards   PPRetrans
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
10.220.65.17 snk         33202    0         0          0
10.220.65.8 snk          33102    0         0          0
10.220.65.8 src          33101    0         0          0
10.220.65.17 src         33201    0         0          0
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
RRCP NODE TABLE          Pkts     PktRate   BCPkts     PPPkts
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
10.220.65.8 snk          107493   7682      107480     13
10.220.65.17 snk         48520    3400      48435      85
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
I have a few questions:
Does this output look reasonable?
If I run with the other two RRCP ports (20016/20017) instead, the RRMP 6 section at the bottom disappears. Why is that?
Can you advise if I am running the right rmon parameters to investigate congestion on the TREP network?
Also, can you point out which stats matter most for congestion (the kind that leads to dropped ticks)?
Do I need to run rmon in each of the 4 TREP containers to fully analyse the TREP infrastructure (i.e. rrcpd, rrcpd, ads, adh)?
How do I check which NIC TREP is using for each container?
Any other comments? Sorry the question is a bit nebulous, but I couldn't find much documentation on this.