Our team is developing a Java application that uses the ETA API to obtain and contribute information from TREP. I have been following the online guidance for developing the consumer application, and so far there is one concerning bug. Our application occasionally loses its channel connection to TREP, and we have had to set up an automatic recovery method to keep the market information live. The disconnects happen at random, anywhere from once a week to once a day. I followed the linked example to program a ping handler that periodically checks the channel status. I assume the channel connection is bad if my ping action returns a code that is less than TransportReturnCodes.SUCCESS.
How can I trace the root cause of these failing ping attempts? Is something wrong with the client application / example code, or with the TREP ADS service?
Please help, and thank you for your time.
First, we need to determine which side (TREP or client application) has cut the connection.
Typically, when TREP cuts the connection, the reason for disconnection will show in the ADS log file. There are two reasons for TREP to cut the connection.
1. Buffer overflow condition. This indicates that the application is a slow consumer that is unable to keep up with the number of messages sent by ADS.
User user at position 10.42.68.175/net on host host1 using application256 on channel 257 has been disconnected due to an overflow condition.
2. Ping timeout. This indicates that the application didn't send ping messages to ADS within the ping timeout.
RSSL disconnect from "user" at position "10.42.68.175/U8009686-TPL-A" on host "host1" using application "256" on channel 19. Reason: Client application did not ping.
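To compare the client side against a "did not ping" entry in the ADS log, it can help to record when each ping attempt happened and what it returned. The following is only a minimal sketch and does not use the ETA API directly: PingGapTracker, its methods, and the local SUCCESS constant are hypothetical stand-ins. It assumes the convention from the question (a return code below TransportReturnCodes.SUCCESS means failure) and that the application knows its ping timeout in milliseconds.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of ETA): records ping attempts and flags suspicious gaps. */
class PingGapTracker {
    // Assumption: local stand-in for TransportReturnCodes.SUCCESS.
    static final int SUCCESS = 0;

    private final long pingTimeoutMillis;
    private long lastSuccessMillis = -1; // -1 means no successful ping recorded yet
    private final List<String> warnings = new ArrayList<>();

    PingGapTracker(long pingTimeoutMillis) {
        this.pingTimeoutMillis = pingTimeoutMillis;
    }

    /** Call once after every ping attempt with the current time and the ping's return code. */
    void record(long nowMillis, int returnCode) {
        if (returnCode < SUCCESS) {
            // Same failure test as in the question: anything below SUCCESS is treated as bad.
            warnings.add("ping failed at " + nowMillis + " ms with return code " + returnCode);
            return;
        }
        if (lastSuccessMillis >= 0) {
            long gap = nowMillis - lastSuccessMillis;
            if (gap > pingTimeoutMillis) {
                // A gap this long is what the server would see as a client that did not ping.
                warnings.add("gap of " + gap + " ms between successful pings exceeds timeout of "
                        + pingTimeoutMillis + " ms");
            }
        }
        lastSuccessMillis = nowMillis;
    }

    List<String> warnings() {
        return warnings;
    }

    public static void main(String[] args) {
        PingGapTracker tracker = new PingGapTracker(60_000);
        tracker.record(0, SUCCESS);
        tracker.record(20_000, SUCCESS);
        tracker.record(90_000, SUCCESS); // 70 s since the previous successful ping
        tracker.record(95_000, -1);      // failed attempt
        tracker.warnings().forEach(System.out::println);
    }
}
```

In a real application, record would be called from the existing ping handler with System.currentTimeMillis() and the actual return code, and the warnings written to the application log with timestamps so they can be lined up against the ADS log.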
If TREP has cut the connection, please check the ADS log for the reason for the disconnection.
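Because the ADS log is shared by many clients, it may be easier to filter it down to entries for your own position first. The sketch below is one possible way to do that, assuming only that the disconnect lines contain the phrases shown in the two samples above ("overflow condition" and "did not ping"). AdsLogFilter and its methods are hypothetical names, and real ADS log lines may be formatted differently in your environment.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: picks out disconnect entries for one client position. */
class AdsLogFilter {
    enum Reason { OVERFLOW, PING_TIMEOUT, OTHER }

    /** Classify a log line by the phrases seen in the sample entries from this thread. */
    static Reason classify(String line) {
        if (line.contains("overflow condition")) {
            return Reason.OVERFLOW;
        }
        if (line.contains("did not ping")) {
            return Reason.PING_TIMEOUT;
        }
        return Reason.OTHER;
    }

    /** Keep only disconnect lines that mention the given position string. */
    static List<String> disconnectsFor(List<String> lines, String position) {
        List<String> matches = new ArrayList<>();
        for (String line : lines) {
            Reason reason = classify(line);
            if (reason != Reason.OTHER && line.contains(position)) {
                matches.add(reason + ": " + line);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // The two sample entries quoted earlier in this thread.
        List<String> lines = Arrays.asList(
            "User user at position 10.42.68.175/net on host host1 using application256 "
                + "on channel 257 has been disconnected due to an overflow condition.",
            "RSSL disconnect from \"user\" at position \"10.42.68.175/U8009686-TPL-A\" on host "
                + "\"host1\" using application \"256\" on channel 19. "
                + "Reason: Client application did not ping.");
        // Filter by the position string that identifies one particular client.
        disconnectsFor(lines, "U8009686-TPL-A").forEach(System.out::println);
    }
}
```

To run this against a real log, the sample list could be replaced with lines read from a copy of the log file, and the position string with the one your application connects with.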
Update from @ccai via email.
I believe our developers have enabled logging on the server side, and I can see overflow and timeout entries in the log. However, I don’t think the error log captures the disconnects related to our application; rather, it captures other people’s mistakes. I need to go through the latest log again once it’s shared with me. So far I haven’t found the log that helpful.
Can you please suggest other ways to debug, or possible solutions? I am happy to have a call with your team at any time to resolve this issue. This really makes us concerned about the production environment and the stability of the API.