A mix of Request timeout/channel down/Service not up

Our feed handler uses the C# Real-Time SDK and subscribes to ~5000 names. On Dec 18, 2024, we noticed some issues from ~14:30 to 16:30 EST. Below are some samples of the status messages we received:

2024-12-18 14:51:58.9900 | Refinitiv StatusMsg: Name: CASY.O, ServiceName: hEDD, State: Open / Suspect / None / 'channel down.' 
2024-12-18 14:51:58.9900 | Refinitiv StatusMsg: Name: VIRC.OQ, ServiceName: hEDD, State: Open / Suspect / None / 'Request timeout' 
2024-12-18 14:51:58.9900 | Refinitiv StatusMsg: Name: CMPS.O, ServiceName: hEDD, State: Open / Suspect / None / 'channel down.' 
2024-12-18 14:51:58.9900 | Refinitiv StatusMsg: Name: CMRE.K, ServiceName: hEDD, State: Open / Suspect / None / 'channel down.' 
2024-12-18 14:51:58.9900 | Refinitiv StatusMsg: Name: VIRT.OQ, ServiceName: hEDD, State: Open / Suspect / None / 'Request timeout' 
2024-12-18 14:51:58.9900 | Refinitiv StatusMsg: Name: CZR.O, ServiceName: hEDD, State: Open / Suspect / None / 'channel down.' 
2024-12-18 14:51:58.9900 | Refinitiv StatusMsg: Name: VIS.P, ServiceName: hEDD, State: Open / Suspect / None / 'Request timeout' 
2024-12-18 14:51:58.9900 | Refinitiv StatusMsg: Name: CZWI.O, ServiceName: hEDD, State: Open / Suspect / None / 'channel down.' 
2024-12-18 14:51:58.9900 | Refinitiv StatusMsg: Name: VIST.N, ServiceName: hEDD, State: Open / Suspect / None / 'Request timeout' 
2024-12-18 14:51:58.9900 | Refinitiv StatusMsg: Name: EFAV.K, ServiceName: hEDD, State: Open / Suspect / None / 'channel down.' 
... a lot more similar rows…
2024-12-18 15:47:17.0643 | Refinitiv StatusMsg: Name: QFIN.OQ, ServiceName: hEDD, State: Open / Suspect / None / 'Service not up' 
2024-12-18 15:47:17.0643 | Refinitiv StatusMsg: Name: WAY.O, ServiceName: hEDD, State: Open / Suspect / None / 'Request timeout' 
2024-12-18 15:47:17.0643 | Refinitiv StatusMsg: Name: QGEN.N, ServiceName: hEDD, State: Open / Suspect / None / 'Service not up' 
2024-12-18 15:47:17.0643 | Refinitiv StatusMsg: Name: WB.O, ServiceName: hEDD, State: Open / Suspect / None / 'Request timeout' 
2024-12-18 15:47:17.0643 | Refinitiv StatusMsg: Name: MKZR.OQ, ServiceName: hEDD, State: Open / Suspect / None / 'Service not up' 
2024-12-18 15:47:17.0643 | Refinitiv StatusMsg: Name: MYRG.OQ, ServiceName: hEDD, State: Open / Suspect / None / 'Service not up' 
2024-12-18 15:47:17.0643 | Refinitiv StatusMsg: Name: WBA.O, ServiceName: hEDD, State: Open / Suspect / None / 'Request timeout' 
2024-12-18 15:47:17.0643 | Refinitiv StatusMsg: Name: GRVY.OQ, ServiceName: hEDD, State: Open / Suspect / None / 'Service not up' 
2024-12-18 15:47:17.0643 | Refinitiv StatusMsg: Name: WBD.O, ServiceName: hEDD, State: Open / Suspect / None / 'Request timeout' 
2024-12-18 15:47:17.0643 | Refinitiv StatusMsg: Name: GS.N, ServiceName: hEDD, State: Open / Suspect / None / 'Service not up' 
2024-12-18 15:47:17.0643 | Refinitiv StatusMsg: Name: WBTN.O, ServiceName: hEDD, State: Open / Suspect / None / 'Request timeout' 
2024-12-18 15:47:17.0643 | Refinitiv StatusMsg: Name: IMTX.OQ, ServiceName: hEDD, State: Open / Suspect / None / 'Service not up' 
... a lot more similar rows…
2024-12-18 16:15:23.0855 | Refinitiv StatusMsg: Name: PAYC.N, ServiceName: hEDD, State: Open / Suspect / None / 'Service not up' 
2024-12-18 16:15:23.0855 | Refinitiv StatusMsg: Name: SONY.K, ServiceName: hEDD, State: Open / Suspect / None / 'channel down.' 
2024-12-18 16:15:23.0855 | Refinitiv StatusMsg: Name: SOUN.O, ServiceName: hEDD, State: Open / Suspect / None / 'channel down.' 
2024-12-18 16:15:23.0855 | Refinitiv StatusMsg: Name: PB.N, ServiceName: hEDD, State: Open / Suspect / None / 'Service not up' 
2024-12-18 16:18:34.3745 | Refinitiv StatusMsg: Name: QTEC.O, ServiceName: hEDD, State: Open / Suspect / None / 'Request timeout' 
2024-12-18 16:18:34.3745 | Refinitiv StatusMsg: Name: CXM.N, ServiceName: hEDD, State: Open / Suspect / None / 'Request timeout' 
2024-12-18 16:18:34.3745 | Refinitiv StatusMsg: Name: QTRX.O, ServiceName: hEDD, State: Open / Suspect / None / 'Request timeout' 
2024-12-18 16:18:34.3745 | Refinitiv StatusMsg: Name: CXW.N, ServiceName: hEDD, State: Open / Suspect / None / 'Request timeout' 

We have already opened a ticket for it (not sure if you have access to the ticketing system, but FYI: 14220838) and asked the support team to log in to the server to see if anything is wrong.

But we are also interested in the technical aspect: what do these three message types mean? We heard that channel down means we have a slow consumer or a slow network. Can you give us a bit more insight?

Thanks,


    Hi @Jirapongse .

    Thanks for this answer. Can I summarize it as follows?

    1. If we see only channel down in the log, we cannot rule out a slow consumer as the cause; we need to open a ticket and have the service log checked to be sure.
    2. If we see only request timeout, or request timeout together with channel down, it is a strong signal that the problem is on the network. My reasoning: request timeout means the client-side SDK code was still waiting for a response from the service and did not receive one within 15 seconds. A "slow consumer" typically means our callback implementation (i.e., OnUpdateMsg()) is too slow, but in the request timeout scenario OnUpdateMsg() is not invoked at all. So request timeout is most likely a connectivity issue rather than a slow consumer issue.
    3. If we see Service not up in the log, it is a bit more complicated; it could be that the server-side program is not running.

    We are especially interested in the second point, as we are currently trying to identify whether a slow consumer or a slow network is to blame.
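
    To see which of the three cases above applies at each point in time, the status text can be tallied per timestamp. The following is a minimal Python sketch, assuming the log lines look exactly like the samples in this thread; `LINE_RE` and `tally_status_text` are illustrative names and not part of the SDK.

```python
import re
from collections import Counter, defaultdict

# Assumed line shape (taken from the samples in this thread):
# <timestamp> | Refinitiv StatusMsg: Name: <RIC>, ServiceName: <svc>, State: a / b / c / '<text>'
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+) \| Refinitiv StatusMsg: "
    r"Name: (?P<name>[^,]+), ServiceName: (?P<service>[^,]+), "
    r"State: (?P<stream>[^/]+) / (?P<data>[^/]+) / (?P<code>[^/]+) / '(?P<text>[^']*)'"
)


def tally_status_text(lines):
    """Return {timestamp: Counter({status_text: count})} for matching lines."""
    by_ts = defaultdict(Counter)
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip elided rows and anything that is not a StatusMsg
        # Normalise 'channel down.' to 'channel down' so the keys are uniform.
        text = match.group("text").strip().rstrip(".")
        by_ts[match.group("ts")][text] += 1
    return dict(by_ts)


if __name__ == "__main__":
    sample = [
        "2024-12-18 14:51:58.9900 | Refinitiv StatusMsg: Name: CASY.O, ServiceName: hEDD, State: Open / Suspect / None / 'channel down.' ",
        "2024-12-18 14:51:58.9900 | Refinitiv StatusMsg: Name: VIRC.OQ, ServiceName: hEDD, State: Open / Suspect / None / 'Request timeout' ",
        "2024-12-18 15:47:17.0643 | Refinitiv StatusMsg: Name: QFIN.OQ, ServiceName: hEDD, State: Open / Suspect / None / 'Service not up' ",
    ]
    for ts, counts in sorted(tally_status_text(sample).items()):
        print(ts, dict(counts))
```

    A timestamp that shows only request timeout would fall under case 2, while a mix with Service not up would fall under case 3. This only classifies the log; it does not prove the root cause.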

    wasin.w (admin), Accepted Answer

    Hello @Y_Intercept

    Please note that you can set the RequestTimeout parameter in the EmaConfig.xml file to adjust the amount of time (in milliseconds) that the OmmConsumer waits for a response message.

    req_timeout.png

    @Y_Intercept

    You can first check whether this is a network issue by running the testclient tool on the same network to determine the update rate of all subscribed items.

    The tool is included in the ADS package. You can also download it (Infrastructure Tools) from Software Downloads, under the MDS - Infra/Infrastructure Tools category.

    First, create a rics.txt file that contains all subscribed RICs. For example:

    CASY.O
    GRVY.OQ
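
    If the subscription list is not readily available as a file, one way to build rics.txt is to pull the names out of the application log. This is a minimal Python sketch, assuming one RIC per line in rics.txt (as the example above suggests) and the log format shown in the question; `extract_rics` and `write_rics_file` are hypothetical helper names. It only captures names that appear in the log, so the application's full subscription list is preferable when you have it.

```python
import re

# Assumes the "StatusMsg: Name: <RIC>," fragment seen in the question's log samples.
NAME_RE = re.compile(r"StatusMsg: Name: (?P<name>[^,\s]+),")


def extract_rics(log_lines):
    """Return the unique RICs found in the log, in first-seen order."""
    seen = {}
    for line in log_lines:
        match = NAME_RE.search(line)
        if match:
            seen.setdefault(match.group("name"), None)
    return list(seen)


def write_rics_file(rics, path="rics.txt"):
    """Write one RIC per line (assumed input format for the -f option)."""
    with open(path, "w", encoding="ascii", newline="\n") as handle:
        for ric in rics:
            handle.write(ric + "\n")


if __name__ == "__main__":
    sample = [
        "2024-12-18 14:51:58.9900 | Refinitiv StatusMsg: Name: CASY.O, ServiceName: hEDD, State: Open / Suspect / None / 'channel down.' ",
        "2024-12-18 15:47:17.0643 | Refinitiv StatusMsg: Name: GRVY.OQ, ServiceName: hEDD, State: Open / Suspect / None / 'Service not up' ",
    ]
    write_rics_file(extract_rics(sample))
```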
    

    Then, run the test client with the following parameters.

    ./testclient -h <server ip> -p <server port> -S <service name> -f rics.txt -I 1 -u <user>
    

    The output will look like this:

    image.png

    You may need to run it during the same window, ~14:30 to 16:30 EST. If the testclient handles all updates without any disconnections, it is probably not a network issue.

    The testclient result gives you the average update rate during the peak period. You can then check whether the application can properly handle that update rate.