We've written a service running in AWS using Java EMA 184.108.40.206 and connecting to EDP. It's receiving MMT_MARKET_PRICE and MMT_MARKET_BY_ORDER messages for 2,600 ASX RICs.
Occasionally our service fails with an OutOfMemoryError: GC overhead limit exceeded. It can run for days without issues and then fail. We're monitoring memory use and can't see any issues prior to failure.
About 5 minutes prior to failure we see this warning in the log from EMA:
loggerMsg\n ClientName: ChannelCallbackClient\n Severity: Warning\n Text: Received ChannelDownReconnecting event on channel Channel_3\n\tRsslReactor @44d8d44\n\tRsslChannel @25739122\n\tError Id 0\n\tInternal sysError 0\n\tError Location null\n\tError text CompressorException: invalid code lengths set\nloggerMsgEnd\n\n
Then, when the channel reconnects, we see these errors coming from EMA before the channel goes down again:
Unknown msgClass: 0
Unknown msgClass: 19
Unknown msgClass: 14
Unknown msgClass: 25
Unknown msgClass: 13
Unknown msgClass: 29
Unknown msgClass: 17
Unknown msgClass: 26
Unknown msgClass: 30
Unknown msgClass: 27
This can happen several times, then the pod CPU goes to 100% and the service eventually fails with the memory exception.
After the first ChannelDownReconnecting event above, we see warnings like this one each time the channel goes down and reconnects again:
loggerMsg\n ClientName: ChannelCallbackClient\n Severity: Warning\n Text: Received ChannelDownReconnecting event on channel Channel_3\n\tRsslReactor @44d8d44\n\tRsslChannel @12980f25\n\tError Id -1\n\tInternal sysError 0\n\tError Location Reactor.performChannelRead\n\tError text \nloggerMsgEnd\n\n
I haven't been able to replicate this issue locally in Dev, so it's difficult to determine whether the EMA errors are the cause of our service failure or a symptom of some other issue.
What does 'CompressorException: invalid code lengths set' mean?
And why are these 'Unknown msgClass' messages raised?
Unknown msgClass usually indicates that the message type (Refresh, Update, Status, etc.) read from the message is not a defined one, which points to corrupt or invalid payload content.
The only place I can find the "invalid code lengths set" message is in the zlib library source code, which suggests the CompressorException is raised while decompressing the incoming data.
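To illustrate where that kind of text comes from, here is a minimal sketch using the JDK's own zlib binding (java.util.zip.Inflater), not EMA's compression code. The class name, helper method and the four-byte test input are all my own hypothetical constructions: the bytes are a hand-built raw DEFLATE block header whose code length set is deliberately invalid, not data captured from a feed. On a JDK backed by standard zlib I would expect the printed message to be the same "invalid code lengths set" text, but treat that as an assumption rather than a guarantee.

```java
import java.util.zip.DataFormatException;
import java.util.zip.Inflater;

class ZlibErrorDemo {
    // Hypothetical test data (not captured from EMA): a raw DEFLATE header for a
    // final dynamic-Huffman block that declares four code-length codes, each of
    // length 1. Four 1-bit codes cannot coexist, so the set is over-subscribed.
    static final byte[] BAD_CODE_LENGTHS = {0x05, 0x00, (byte) 0x92, 0x04};

    // Tries to inflate the input and returns the message of the
    // DataFormatException if one is thrown, or null if no error is reported.
    static String inflateErrorMessage(byte[] compressed, boolean nowrap) {
        Inflater inflater = new Inflater(nowrap); // nowrap=true means raw DEFLATE, no zlib header
        try {
            inflater.setInput(compressed);
            byte[] out = new byte[256];
            while (!inflater.finished()) {
                int n = inflater.inflate(out);
                if (n == 0) {
                    break; // no progress possible (more input or a dictionary is needed)
                }
            }
            return null;
        } catch (DataFormatException e) {
            // The message text is supplied by the underlying zlib library.
            return e.getMessage();
        } finally {
            inflater.end(); // release native resources
        }
    }

    public static void main(String[] args) {
        System.out.println(inflateErrorMessage(BAD_CODE_LENGTHS, true));
    }
}
```

The point is only that a compressed stream damaged in the wrong place fails inside the decompressor before any message decoding happens, which would be consistent with the channel being dropped straight after the CompressorException.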
Given that you are losing your connection to ERT (EDP), you should raise a ticket with the ERT in Cloud team to investigate these connectivity losses. Make sure you choose Elektron Real Time in Cloud as the Product (and not an API, otherwise it will go to the wrong team).
You could also raise a ticket with Developer Support to investigate whether data is getting corrupted at the API level, perhaps after a connection loss/recovery scenario.