We are running RFAJ 7.6.0.E9 through 188.8.131.52 on various Linux systems that connect to various ADS 2.5.0.L1 instances, also running on Linux. The ADS disconnects the session because it does not receive 3 consecutive heartbeats from the RFAJ systems. This has been confirmed by watching the communication between the RFAJ systems and the ADSs. The JVM and the relevant code have been verified to be running before and after the ADS initiates the disconnect. Has anyone seen this behavior? Does anyone have any suggestions for where to look on the RFAJ system to determine where, or why, the heartbeats are not being sent?
If no ping messages are being sent out, something is interfering with RFA Java's ping management. This could be a) a lack of CPU time, b) the RFAJ thread being too busy, or c) the RFAJ thread having exited abnormally.
To verify a), check whether the client machines had any resource issues (e.g. CPU, memory) around the time of the disconnect, especially when the clients are running on VMs.
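One rough way to look for a) from inside the JVM is to run a small watchdog thread that sleeps for a fixed interval and logs whenever it wakes up much later than requested. Late wake-ups that line up with the disconnect times would point to CPU starvation, long GC pauses or a frozen VM. This is only a sketch: StallDetector and its method names are made up for illustration and are not part of RFA Java.

```java
import java.time.Instant;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of RFA Java: reports how late a sleeping thread wakes up.
final class StallDetector {

    /** Sleeps for sleepMillis and returns how many milliseconds late the wake-up was (never negative). */
    static long measureOvershootMillis(long sleepMillis) {
        long start = System.nanoTime();
        try {
            Thread.sleep(sleepMillis);
        } catch (InterruptedException e) {
            // Restore the interrupt flag so the caller's loop can stop.
            Thread.currentThread().interrupt();
        }
        long elapsedMillis = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        return Math.max(0, elapsedMillis - sleepMillis);
    }

    /** Starts a daemon thread that logs every wake-up that is more than thresholdMillis late. */
    static Thread startMonitor(long intervalMillis, long thresholdMillis) {
        Thread t = new Thread(() -> {
            while (!Thread.currentThread().isInterrupted()) {
                long late = measureOvershootMillis(intervalMillis);
                if (late > thresholdMillis) {
                    System.err.println(Instant.now() + " stall detected: woke up " + late + " ms late");
                }
            }
        }, "stall-detector");
        t.setDaemon(true);
        t.start();
        return t;
    }
}
```

Calling something like StallDetector.startMonitor(100, 500) at application start-up would log any pause longer than about half a second; the interval and threshold here are arbitrary and should be chosen relative to your ping interval.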
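For b), one possibility is that the RFAJ thread is being used for time-intensive event processing. This happens when a null event queue is used (null is specified for the event queue when invoking the registerClient method), so the application's event processing runs on the RFAJ thread itself.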
For c), if the application was able to function normally (without a restart) after the disconnection, that would indicate the RFAJ thread was still functioning normally, and c) can be ruled out.
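If you want to check c) more directly, you can take a thread dump (for example with jstack against the JVM's pid) before and after a disconnect and compare which threads are present. The same information can be collected from inside the process; below is a minimal sketch using only the standard Thread API. ThreadSnapshot is a made-up name, and which thread names belong to RFAJ is something you would have to read off your own dump.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of RFA Java: captures the names and states of all live threads.
final class ThreadSnapshot {

    /** Returns "name#id" -> state for every live thread in this JVM, sorted by name. */
    static Map<String, Thread.State> liveThreads() {
        Map<String, Thread.State> result = new TreeMap<>();
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            result.put(t.getName() + "#" + t.getId(), t.getState());
        }
        return result;
    }
}
```

Logging this map periodically and comparing the entries from before and after a disconnect would show whether any thread disappeared in between.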
It might also occur when the application process is very busy and consumes so many resources that RFA is prevented from performing its administrative operations. This is known as the slow consumer problem. It usually occurs when the application subscribes to a large number of items, or to items with a very high update rate, or when the item-processing callback method contains time-consuming logic.
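A common way to keep a callback from becoming the bottleneck is to have it do nothing except hand the event to a separate worker thread. The sketch below shows that pattern with plain java.util.concurrent classes. OffloadingHandler and the generic type E are placeholders for illustration, not RFA Java types; you would call onEvent from your own callback.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Consumer;

// Hypothetical wrapper, not an RFA Java class: E stands in for whatever event type the callback receives.
final class OffloadingHandler<E> {
    // A single worker thread keeps events in arrival order.
    private final ExecutorService worker = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "event-worker");
        t.setDaemon(true);
        return t;
    });
    private final Consumer<E> slowLogic;

    OffloadingHandler(Consumer<E> slowLogic) {
        this.slowLogic = slowLogic;
    }

    /** Call this from the callback: it only queues the event and returns immediately. */
    void onEvent(E event) {
        worker.execute(() -> slowLogic.accept(event));
    }

    /** Stops accepting new events; events that are already queued are still processed. */
    void shutdown() {
        worker.shutdown();
    }
}
```

Note that the executor's queue here is unbounded, so if the slow logic cannot keep up over a long period the backlog will grow in memory. That makes the problem visible as memory growth rather than solving it, so the processing itself may still need to be made cheaper or the update rate reduced.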
However, this issue may also happen with a very tight or impractical pingInterval value (such as pingInterval = 1 or 2 seconds).
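To see why a tight interval is fragile, it helps to work out how long a pause the connection can survive. The sketch below assumes the rule described in the question (the ADS disconnects after 3 consecutive missed heartbeats) and assumes one heartbeat is expected per pingInterval; both are assumptions about your setup, and PingBudget is a made-up helper.

```java
// Hypothetical helper, not part of RFA Java: estimates how long a stall the connection can tolerate.
final class PingBudget {

    /**
     * Assumes one heartbeat is expected every pingIntervalSeconds and the peer
     * disconnects after missedHeartbeatsAllowed consecutive misses.
     */
    static long toleranceMillis(int missedHeartbeatsAllowed, long pingIntervalSeconds) {
        return missedHeartbeatsAllowed * pingIntervalSeconds * 1000L;
    }
}
```

Under those assumptions, PingBudget.toleranceMillis(3, 2) gives 6000, so with pingInterval = 2 any stall of about 6 seconds (a long GC pause, a busy callback, a paused VM) is enough to trigger the disconnect, while toleranceMillis(3, 30) gives 90000.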