We have an ETA provider app that is getting the error below:
thirdparty/elektron/Elektron-SDK/Cpp-C/Eta/Impl/Transport/rsslSocketTransportImpl.c:3412> Error: 1002 ipcWrite() failed. System errno: (32)
At the same time the consumer application (rmdstestclient) is getting the following error:
Channel: 1 disconnected. Reason: RSSL Channel has disconnected on read failure. <Impl/ripcsrvr.c:6876> Error:1002 ripcRead() failure. System errno: (104)
Both applications are running on the same AWS EC2 host. Can you provide any advice on system tuning or development changes that could help prevent these errors?
Those errors look suspiciously like the ones we see in a slow consumer scenario, where the consumer does not read the data published by the ADS/provider in a timely manner, so the outbound buffer overflows and the consumer is disconnected.
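For reference, the two errno values in the log lines can be decoded with a few lines of C. This is a minimal sketch that assumes the EC2 host uses Linux errno numbering, where 32 is EPIPE and 104 is ECONNRESET; the helper names `errno_name` and `errno_text` are hypothetical and are not part of ETA.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Assumption: Linux errno numbering, as implied by the values 32 and 104 in
 * the logs. The build stops here on a platform that numbers them differently. */
static_assert(EPIPE == 32 && ECONNRESET == 104,
              "errno numbering differs from the values seen in the logs");

/* Hypothetical helper: symbolic name for the two errno values in the logs. */
const char *errno_name(int err)
{
    switch (err) {
    case EPIPE:      return "EPIPE";      /* wrote to a connection the peer had closed */
    case ECONNRESET: return "ECONNRESET"; /* the peer reset the connection */
    default:         return "OTHER";
    }
}

/* Hypothetical helper: the C library's description of err, e.g. for logging. */
const char *errno_text(int err)
{
    return strerror(err);
}
```

Both values only say that each side found the connection already torn down by its peer. They do not say why, so the buffer and CPU questions still need to be checked separately.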
This is often because the client consumer app is doing too much work in its callbacks, thereby starving the API thread, which then can't read the data quickly enough. However, I can't see rmdstestclient doing that, as it was designed to be a lean, high-performance test tool.
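For a consumer that does have heavy callbacks, one common remedy is to have the callback copy out what it needs and hand it to a worker thread, so the API thread can return to reading straight away. The sketch below is a generic single-producer/single-consumer ring buffer using C11 atomics. It is not ETA code; `WorkItem`, `WorkQueue`, `QUEUE_CAPACITY` and the `queue_*` functions are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_CAPACITY 1024u /* hypothetical size; must be a power of two */
static_assert((QUEUE_CAPACITY & (QUEUE_CAPACITY - 1u)) == 0u,
              "QUEUE_CAPACITY must be a power of two");

/* Hypothetical unit of work: whatever the callback needs to hand off. */
typedef struct {
    int    stream_id;
    double price;
} WorkItem;

/* Single-producer/single-consumer ring: the API thread pushes, one worker pops. */
typedef struct {
    WorkItem      slots[QUEUE_CAPACITY];
    atomic_size_t head; /* next slot to write; advanced only by the producer */
    atomic_size_t tail; /* next slot to read; advanced only by the consumer */
} WorkQueue;

void queue_init(WorkQueue *q)
{
    atomic_init(&q->head, 0);
    atomic_init(&q->tail, 0);
}

/* Called from the callback. Never blocks; returns false if the queue is full. */
bool queue_push(WorkQueue *q, const WorkItem *item)
{
    size_t head = atomic_load_explicit(&q->head, memory_order_relaxed);
    size_t tail = atomic_load_explicit(&q->tail, memory_order_acquire);
    if (head - tail == QUEUE_CAPACITY)
        return false;
    q->slots[head & (QUEUE_CAPACITY - 1u)] = *item;
    atomic_store_explicit(&q->head, head + 1, memory_order_release);
    return true;
}

/* Called from the worker thread. Returns false if nothing is queued. */
bool queue_pop(WorkQueue *q, WorkItem *out)
{
    size_t tail = atomic_load_explicit(&q->tail, memory_order_relaxed);
    size_t head = atomic_load_explicit(&q->head, memory_order_acquire);
    if (head == tail)
        return false;
    *out = q->slots[tail & (QUEUE_CAPACITY - 1u)];
    atomic_store_explicit(&q->tail, tail + 1, memory_order_release);
    return true;
}
```

The push side never blocks, so a full queue shows up as a `false` return that the application can count or log, rather than as a stalled read loop.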
I don't know that much about AWS, so could it be that the AWS instance is not sized correctly and thereby the consumer is not getting enough CPU time?
Is the consumer connected directly to the provider, or via an ADS (+ADH) or RTC? If it goes via an ADS or RTC, the ADS or RTC log files should confirm whether you are in a buffer overflow scenario.
Also, the TransportAPI_C_PerfToolsGuide.pdf that is included with the RT-SDK documentation has a Performance Best Practices section, which provides guidance in this area, e.g. adjusting the maxOutputBuffers and guaranteedOutputBuffers values. You can also find more details on the various parameters in the TransportAPI_C_DevGuide.pdf.
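To get a feel for what a larger output buffer limit can and cannot do, here is a toy model in C. It is only a sketch under simplifying assumptions (fixed per-tick write and drain rates, one buffer per message) and does not reflect how ETA actually manages its buffer pools; `ticks_until_overflow` and its parameters are hypothetical names.

```c
#include <assert.h>

/* Toy model of an outbound buffer pool. Each tick the provider queues
 * written_per_tick buffers, then the consumer drains drained_per_tick.
 * Returns the tick at which the pool first exceeds max_buffers, or -1 if
 * it survives all ticks. */
long ticks_until_overflow(long max_buffers, long written_per_tick,
                          long drained_per_tick, long ticks)
{
    assert(max_buffers >= 0 && written_per_tick >= 0 && drained_per_tick >= 0);
    long queued = 0;
    for (long t = 1; t <= ticks; ++t) {
        queued += written_per_tick;  /* provider writes into the pool */
        if (queued > max_buffers)
            return t;                /* no free buffer left: overflow */
        queued -= drained_per_tick;  /* consumer reads from the socket */
        if (queued < 0)
            queued = 0;
    }
    return -1;
}
```

With 10 writes and 8 reads per tick, a limit of 50 overflows at tick 22 and a limit of 500 at tick 247, while 10 reads per tick never overflows. In this model a larger limit absorbs bursts and buys time, but a consumer that is persistently slower than the provider still gets disconnected eventually.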