OmmConsumer, ping timeout, channel recover and maximum load.

Hi,
We have an application that uses EMA Java to subscribe to, process and then republish data to our TREP. We are seeing an issue where the OmmConsumerImpl loses its connection (channel down) and then fails to recover. We have a ChannelSet configuration, and the logs show the OmmConsumerImpl bringing up a channel, getting a ping timeout (30 seconds) and then trying the next channel. Both ADS servers were up at the time of the failure and other applications were working fine, so this appears to be an issue within EMA failing the connection locally. We have seen the connection recover from a channel down at other times, but the example below shows a case where it never recovers.

The broader question is why we get the channel down events in the first place. The ChannelSet failing to recover is also a fairly big issue.

Is there a maximum load that a single OmmConsumer instance should be used for? We are subscribing to around 8000 tickers and we were receiving about 10k updates/second at the time of the initial Channel Down event.

Perhaps we need to tune some of the parameters. We currently have:

Consumer:

  • LoginRequestTimeOut = 30000 ms
  • User dispatch
  • MaxDispatchCountUserThread = default (100). Should we set this higher, given that we expect thousands of messages a second?
  • DispatchTimeoutApiThread = 0 (default). Should we adjust this?

Channel:

  • NumInputBuffers=2048
  • GuaranteedOutputBuffers=5000
  • ConnectionPingTimeout = default (this appears to be the 30-second timeout seen in the logs below)
  • SysRecvBufSize: not set. Should we set it?
  • SysSendBufSize: not set. Should we set it?
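
As a rough way to reason about the MaxDispatchCountUserThread question above, the sketch below works out how often dispatch() would have to be called to keep up with a given update rate. It assumes that MaxDispatchCountUserThread caps the number of messages handed to the application per dispatch() call. The class and method names are hypothetical, and nothing here calls the EMA API.

```java
/** Back-of-envelope dispatch budget for a user-dispatch consumer (illustrative only). */
final class DispatchBudget {

    /**
     * Minimum number of dispatch() calls per second needed to keep up,
     * assuming each call delivers at most maxPerCall messages.
     */
    static long minDispatchCallsPerSecond(long messagesPerSecond, long maxPerCall) {
        if (messagesPerSecond < 0 || maxPerCall <= 0) {
            throw new IllegalArgumentException("rate must be >= 0 and batch size > 0");
        }
        // Ceiling division: a partial batch still costs one call.
        return (messagesPerSecond + maxPerCall - 1) / maxPerCall;
    }

    /** Average time available per dispatch() call, in microseconds, at that call rate. */
    static long microsPerCall(long callsPerSecond) {
        if (callsPerSecond <= 0) {
            throw new IllegalArgumentException("call rate must be > 0");
        }
        return 1_000_000L / callsPerSecond;
    }

    public static void main(String[] args) {
        long updatesPerSecond = 10_000; // update rate quoted in the question
        long maxPerCall = 100;          // MaxDispatchCountUserThread default quoted in the question
        long calls = minDispatchCallsPerSecond(updatesPerSecond, maxPerCall);
        System.out.println("dispatch calls/s needed: " + calls);
        System.out.println("budget per call (us):    " + microsPerCall(calls));
    }
}
```

With the figures quoted in the question (10,000 updates/second, 100 messages per call), this gives at least 100 dispatch() calls per second, or about 10 ms per call for all callback work. Under the same assumption, raising the limit lowers the required call rate but does not reduce the per-message processing cost.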


Here are the logs from the channel down events:

2018-05-04 14:35:41.110 WARN  [ElektronConsumerDispatcher] access.OmmConsumerImpl - loggerMsg| ClientName: ChannelCallbackClient| Severity: Warning| Text:  Received ChannelDownReconnecting event on channel 2.rmds.| RsslReactor @1d63c996| RsslChannel @4f10502a| Error Id 0| Internal sysError 0| Error Location null| Error text SocketChannel.read returned -1 (end-of-stream)| loggerMsgEnd|
2018-05-04 14:35:41.205 WARN  [ElektronConsumerDispatcher] access.OmmConsumerImpl - loggerMsg| ClientName: LoginCallbackClient| Severity: Warning| Text:  RDMLogin stream state was changed to suspect with status message| username <not set>| usernameType <not set>| State: Open/Suspect/None - text: ""| loggerMsgEnd|
2018-05-04 14:35:41.243 WARN  [ElektronConsumerDispatcher] access.OmmConsumerImpl - loggerMsg| ClientName: ChannelDictionary| Severity: Warning| Text:  RDMDictionary stream state was changed to suspect with status message| streamId 3| Reason State: Open/Suspect/None - text: "channel down."| loggerMsgEnd|
2018-05-04 14:35:41.244 WARN  [ElektronConsumerDispatcher] access.OmmConsumerImpl - loggerMsg| ClientName: ChannelDictionary| Severity: Warning| Text:  RDMDictionary stream state was changed to suspect with status message| streamId 4| Reason State: Open/Suspect/None - text: "channel down."| loggerMsgEnd|
2018-05-04 14:35:41.245 WARN  [ElektronConsumerDispatcher] heartbeat.HeartbeatService - Error event for heartbeat message on HB. ErrorEvent{subject=HB, state=UNKNOWN, statusMsg=channel down.}
2018-05-04 14:35:41.255 WARN  [ElektronConsumerDispatcher] subscribe.RmdsTickClient - Received: ErrorEvent{subject=.., state=UNKNOWN, statusMsg=channel down.}
...
2018-05-04 14:35:42.877 WARN  [ElektronConsumerDispatcher] subscribe.RmdsTickClient - Received: ErrorEvent{subject=.., state=UNKNOWN, statusMsg=channel down.}
2018-05-04 14:35:42.877 WARN  [ElektronConsumerDispatcher] subscribe.RmdsTickClient - Received: ErrorEvent{subject=.., state=UNKNOWN, statusMsg=channel down.}
2018-05-04 14:35:42.878 INFO  [ElektronConsumerDispatcher] access.OmmConsumerImpl - loggerMsg| ClientName: ChannelCallbackClient| Severity: Info| Text:  Received ChannelUp event on channel 1.rmds.| Instance Name CONSUMER_B_PREPROD_1| Component Version ads2.6.9.L1.linux.tis.rrg 64-bit| loggerMsgEnd|
...
2018-05-04 14:40:44.365 WARN  [ElektronConsumerDispatcher] access.OmmConsumerImpl - loggerMsg| ClientName: ChannelCallbackClient| Severity: Warning| Text:  Received ChannelDownReconnecting event on channel 1.rmds.| RsslReactor @1d63c996| RsslChannel @4f10502a| Error Id 0| Internal sysError 0| Error Location Reactor.processWorkerEvent| Error text Ping error for channel: Lost contact with connection...| loggerMsgEnd|
2018-05-04 14:40:44.366 WARN  [ElektronConsumerDispatcher] access.OmmConsumerImpl - loggerMsg| ClientName: LoginCallbackClient| Severity: Warning| Text:  RDMLogin stream state was changed to suspect with status message| username <not set>| usernameType <not set>| State: Open/Suspect/None - text: ""| loggerMsgEnd|
2018-05-04 14:40:44.367 WARN  [ElektronConsumerDispatcher] subscribe.RmdsTickClient - Received: ErrorEvent{subject=.., state=UNKNOWN, statusMsg=channel down.}
...
2018-05-04 14:40:45.391 INFO  [ElektronConsumerDispatcher] access.OmmConsumerImpl - loggerMsg| ClientName: ChannelCallbackClient| Severity: Info| Text:  Received ChannelUp event on channel 2.rmds.| Instance Name CONSUMER_B_PREPROD_1| Component Version ads2.6.9.L1.linux.tis.rrg 64-bit| loggerMsgEnd|


Later on, we had several ping timeouts:

2018-05-04 15:08:56.215 WARN  [ElektronConsumerDispatcher] access.OmmConsumerImpl - loggerMsg| ClientName: ChannelDictionary| Severity: Warning| Text:  RDMDictionary stream state was changed to suspect with status message| streamId 4| Reason State: Open/Suspect/None - text: "channel down."| loggerMsgEnd|
2018-05-04 15:08:56.197 WARN  [ElektronConsumerDispatcher] access.OmmConsumerImpl - loggerMsg| ClientName: ChannelCallbackClient| Severity: Warning| Text:  Received ChannelDownReconnecting event on channel 2.rmds.| RsslReactor @6091c183| RsslChannel @26d64e5a| Error Id 0| Internal sysError 0| Error Location Reactor.processWorkerEvent| Error text Ping error for channel: Lost contact with connection...| loggerMsgEnd|
2018-05-04 15:08:56.204 WARN  [ElektronConsumerDispatcher] access.OmmConsumerImpl - loggerMsg| ClientName: LoginCallbackClient| Severity: Warning| Text:  RDMLogin stream state was changed to suspect with status message| username <not set>| usernameType <not set>| State: Open/Suspect/None - text: ""| loggerMsgEnd|
2018-05-04 15:08:56.213 WARN  [ElektronConsumerDispatcher] access.OmmConsumerImpl - loggerMsg| ClientName: ChannelDictionary| Severity: Warning| Text:  RDMDictionary stream state was changed to suspect with status message| streamId 3| Reason State: Open/Suspect/None - text: "channel down."| loggerMsgEnd|
2018-05-04 15:08:57.269 INFO  [ElektronConsumerDispatcher] access.OmmConsumerImpl - loggerMsg| ClientName: ChannelCallbackClient| Severity: Info| Text:  Received ChannelUp event on channel 1.rmds.| Instance Name CONSUMER_B_PREPROD_1| Component Version ads2.6.9.L1.linux.tis.rrg 64-bit| loggerMsgEnd|
2018-05-04 15:17:14.069 WARN  [ElektronConsumerDispatcher] access.OmmConsumerImpl - loggerMsg| ClientName: ChannelCallbackClient| Severity: Warning| Text:  Received ChannelDownReconnecting event on channel 1.rmds.| RsslReactor @6091c183| RsslChannel @26d64e5a| Error Id 0| Internal sysError 0| Error Location Reactor.processWorkerEvent| Error text Ping error for channel: Lost contact with connection...| loggerMsgEnd|
2018-05-04 15:17:14.069 WARN  [ElektronConsumerDispatcher] access.OmmConsumerImpl - loggerMsg| ClientName: LoginCallbackClient| Severity: Warning| Text:  RDMLogin stream state was changed to suspect with status message| username <not set>| usernameType <not set>| State: Open/Suspect/None - text: ""| loggerMsgEnd|
2018-05-04 15:17:15.116 INFO  [ElektronConsumerDispatcher] access.OmmConsumerImpl - loggerMsg| ClientName: ChannelCallbackClient| Severity: Info| Text:  Received ChannelUp event on channel 2.rmds.| Instance Name CONSUMER_B_PREPROD_1| Component Version ads2.6.9.L1.linux.tis.rrg 64-bit| loggerMsgEnd|
2018-05-04 15:19:44.217 WARN  [ElektronConsumerDispatcher] access.OmmConsumerImpl - loggerMsg| ClientName: ChannelCallbackClient| Severity: Warning| Text:  Received ChannelDownReconnecting event on channel 2.rmds.| RsslReactor @6091c183| RsslChannel @26d64e5a| Error Id 0| Internal sysError 0| Error Location Reactor.processWorkerEvent| Error text Ping error for channel: Lost contact with connection...| loggerMsgEnd|
2018-05-04 15:19:44.218 WARN  [ElektronConsumerDispatcher] access.OmmConsumerImpl - loggerMsg| ClientName: LoginCallbackClient| Severity: Warning| Text:  RDMLogin stream state was changed to suspect with status message| username <not set>| usernameType <not set>| State: Open/Suspect/None - text: ""| loggerMsgEnd|
2018-05-04 15:19:45.258 INFO  [ElektronConsumerDispatcher] access.OmmConsumerImpl - loggerMsg| ClientName: ChannelCallbackClient| Severity: Info| Text:  Received ChannelUp event on channel 1.rmds.| Instance Name CONSUMER_B_PREPROD_1| Component Version ads2.6.9.L1.linux.tis.rrg 64-bit| loggerMsgEnd|
2018-05-04 16:07:23.093 WARN  [ElektronConsumerDispatcher] access.OmmConsumerImpl - loggerMsg| ClientName: ChannelCallbackClient| Severity: Warning| Text:  Received ChannelDownReconnecting event on channel 1.rmds.| RsslReactor @6091c183| RsslChannel @26d64e5a| Error Id 0| Internal sysError 0| Error Location Reactor.processWorkerEvent| Error text Ping error for channel: Lost contact with connection...| loggerMsgEnd|
2018-05-04 16:07:23.094 WARN  [ElektronConsumerDispatcher] access.OmmConsumerImpl - loggerMsg| ClientName: LoginCallbackClient| Severity: Warning| Text:  RDMLogin stream state was changed to suspect with status message| username <not set>| usernameType <not set>| State: Open/Suspect/None - text: ""| loggerMsgEnd|
2018-05-04 16:07:24.663 INFO  [ElektronConsumerDispatcher] access.OmmConsumerImpl - loggerMsg| ClientName: ChannelCallbackClient| Severity: Info| Text:  Received ChannelUp event on channel 2.rmds.| Instance Name CONSUMER_B_PREPROD_1| Component Version ads2.6.9.L1.linux.tis.rrg 64-bit| loggerMsgEnd|
2018-05-04 16:22:31.342 WARN  [ElektronConsumerDispatcher] access.OmmConsumerImpl - loggerMsg| ClientName: ChannelCallbackClient| Severity: Warning| Text:  Received ChannelDownReconnecting event on channel 2.rmds.| RsslReactor @6091c183| RsslChannel @26d64e5a| Error Id 0| Internal sysError 0| Error Location Reactor.processWorkerEvent| Error text Ping error for channel: Lost contact with connection...| loggerMsgEnd|
2018-05-04 16:22:31.343 WARN  [ElektronConsumerDispatcher] access.OmmConsumerImpl - loggerMsg| ClientName: LoginCallbackClient| Severity: Warning| Text:  RDMLogin stream state was changed to suspect with status message| username <not set>| usernameType <not set>| State: Open/Suspect/None - text: ""| loggerMsgEnd|
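
To see how long each channel stayed up before the next channel down event, the ChannelUp and ChannelDownReconnecting entries in logs like these can be paired up. The following is a minimal sketch using only the Java standard library. It assumes each log entry has first been joined onto a single line and that only one channel is active at a time, and the class and method names are hypothetical.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Measures how long each channel stayed up, from one-entry-per-line EMA log output. */
final class ChannelUptime {

    private static final DateTimeFormatter TS =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss.SSS");

    // Timestamp at the start of the line, then the event name and channel from the Text field.
    private static final Pattern EVENT = Pattern.compile(
            "^(\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2}\\.\\d{3}).*"
            + "Received (ChannelUp|ChannelDownReconnecting) event on channel ([^|]+)\\|");

    /** Returns one entry per ChannelUp that is later followed by a channel down event. */
    static List<String> uptimes(List<String> lines) {
        List<String> out = new ArrayList<>();
        LocalDateTime upAt = null;
        String upChannel = null;
        for (String line : lines) {
            Matcher m = EVENT.matcher(line);
            if (!m.find()) {
                continue; // not a channel event
            }
            LocalDateTime ts = LocalDateTime.parse(m.group(1), TS);
            if (m.group(2).equals("ChannelUp")) {
                upAt = ts;
                upChannel = m.group(3).trim();
            } else if (upAt != null) {
                long millis = Duration.between(upAt, ts).toMillis();
                out.add(upChannel + " up " + millis + " ms");
                upAt = null; // wait for the next ChannelUp
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Two entries from the log above, abridged to the fields the pattern needs.
        List<String> sample = Arrays.asList(
                "2018-05-04 14:35:42.878 INFO  [ElektronConsumerDispatcher] access.OmmConsumerImpl - "
                + "loggerMsg| Text:  Received ChannelUp event on channel 1.rmds.| loggerMsgEnd|",
                "2018-05-04 14:40:44.365 WARN  [ElektronConsumerDispatcher] access.OmmConsumerImpl - "
                + "loggerMsg| Text:  Received ChannelDownReconnecting event on channel 1.rmds.| loggerMsgEnd|");
        System.out.println(uptimes(sample)); // prints [1.rmds. up 301487 ms]
    }
}
```

For the first pair of events in the log, this reports that channel 1 was up for about five minutes (14:35:42.878 to 14:40:44.365) before the ping error.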

Hello @tbaker

Normally the Developer Community Forum is for how-to and general questions, but your question seems complex and requires deeper investigation by the API specialist team (my team), e.g. reviewing your source code and configuration and trying to reproduce the problem. To get my team's support, please contact your colleague, jblackburn@ahl.com, who is a premium user and can submit a ticket to us directly via Premium Support. If you would like to become a premium user, please send an email to rdc.administrator@thomsonreuters.com.

tbaker (in reply to pimchaya.wongrukun01)

Hi Pimchaya,

Thanks for the reply. I'll send a request over to become a premium user.

Thanks

Tom

Hello @tbaker

Once you are a premium user, please submit your problem to my team directly via Premium Support. If you have any difficulty submitting the query, please contact rdc.administrator@thomsonreuters.com.

The client has requested support for this issue through "Contact Premium Support". The case number is 06565532.

Case 06565532 remains under investigation, extending triage

<AHS Only>

According to case 06565532, the client reported that they migrated to EMA version 3.2.0.1 and the problem no longer appears. The case is now temporarily closed.

Hello @tbaker

Thank you for your participation in the forum. Is the reply below satisfactory in resolving your query?

If so, please click the 'Accept' text next to the appropriate reply. This will guide all community members who have a similar question.

Thanks,

AHS

1 Answer


After investigation, the “Error text Ping error for channel: Lost contact with connection...” error message does not occur with EMA version 3.2.0.1 (Elektron-SDK_1.2.0.1.L1), for reasons that have not been identified. Upgrading to that version can be used as a workaround/resolution for this issue.


We haven't seen it again yet on the new version, but it's fairly intermittent, so we'll keep an eye out for any new occurrences.
