question

Upvote
Accepted
54 4 4 9

Explain RSSL serverList parameter behavior

In RFA configuration for RSSL consumer connections, we can set the parameter "serverList". According to this parameter description, RFA attempts to connect each servers in the list until it succeeds in establishing the connection.
Our parameter "serverSelectionOrder" is set "false" so that RFA tries to connect to each server in the same order as what is indicated in "serverList" parameter.

Our configuration of "serverList" is: server74, server75, server76, server09, server10.
Lately, we had the following behavior: RFA tries to connect to the first 3 servers:

  1. After the first server, we had the error: Information: Connection Status Changed, RSSL_Cons_Connection "Default::Connection_Sub server74:14002" State: "Down" StatusCode: "None" StatusText: Connection down
  2. After the second server, we had the error: Warning: RSSL Channel closing on connection "Default::Connection_Sub server75:14002" due to "channel disconnect"
  3. After the third server, we had the error: Warning: RSSL Channel closing on connection "Default::Connection_Sub server76:14002" due to "channel flush failure"
  4. After the third, RFA didn't try to connect to the 4th server but it went back to the first server.

Do you know if the error "channel flush failure" can explain this behavior ?

The complete error message is:

Error: RSSL Channel flush failed on connection "Default::Connection_Sub server76:14002".  Channel will be closed.  (Internal debug info: "<..\..\..\Rssl\Impl\rsslImpl.c:1692> rsslFlush() Error: 0007 Only Channels in RSSL_CH_STATE_ACTIVE state can flush.", errno: 0)


We had to following log messages:

[21:29:39]: (ComponentName) Static: (Severity) Warning: Batch request size may be limited due to throttling on RSSL_Cons_Connection "Default::Connection_Sub server74:14002"
[21:29:39]: (ComponentName) Static: (Severity) Information: Using the following configuration for RSSL_Cons_Connection "Default::Connection_Sub server74:14002": hostName = "server74", rsslPort = "14002", interfaceName = "localhost", compressionType = 0, channelBlocking = false, writeBufferMaxFragmentLength = 6144, connectionWaitTimeout = 30000, connectionPingTimeout = 30000, guaranteedOutputBuffers = 400, numInputBuffers = 5, tcp_nodelay = True, throttleEnabled = True, throttleType = count, throttleMaxCount = 250, throttleBatchCount = 10, traceMsg = False, traceMsgToFile = False
[21:29:39]: (ComponentName) Static: (Severity) Information: Using the following configuration for RSSL_Cons_Connection_Manager "Default::Connection_Sub": hostName = "", rsslPort = "14002", requestQueueReadThreshold = 1, watchListTableSize = 10000, requestTimeout = 45000, itemPostTimeout = 15000, serverList = { server74, server75, server76, server09, server10 }, Connection capabilities: OMMMsgConnCapability
[21:30:00]: (ComponentName) Static: (Severity) Warning: RSSL Channel closing on connection "Default::Connection_Sub server74:14002" due to "channel disconnect"
[21:30:00]: (ComponentName) Static: (Severity) Information: Connection Status Changed, RSSL_Cons_Connection "Default::Connection_Sub server74:14002" State: "Down" StatusCode: "None" StatusText: Connection down
[21:30:00]: (ComponentName) Static: (Severity) Information: RSSL_Cons_Connection "Default::Connection_Sub server75:14002" will be tried in 100 milliseconds
[21:30:21]: (ComponentName) Static: (Severity) Warning: RSSL Channel closing on connection "Default::Connection_Sub server75:14002" due to "channel disconnect"
[21:30:21]: (ComponentName) Static: (Severity) Information: Connection Status Changed, RSSL_Cons_Connection "Default::Connection_Sub server75:14002" State: "Down" StatusCode: "None" StatusText: Connection down
[21:30:21]: (ComponentName) Static: (Severity) Information: RSSL_Cons_Connection "Default::Connection_Sub server76:14002" will be tried in 100 milliseconds
[21:30:39]: (ComponentName) Static: (Severity) Error: RSSL Channel flush failed on connection "Default::Connection_Sub server76:14002".  Channel will be closed.  (Internal debug info: "<..\..\..\Rssl\Impl\rsslImpl.c:1692> rsslFlush() Error: 0007 Only Channels in RSSL_CH_STATE_ACTIVE state can flush.", errno: 0)
[21:30:39]: (ComponentName) Static: (Severity) Warning: RSSL Channel closing on connection "Default::Connection_Sub server76:14002" due to "channel flush failure"
[21:30:39]: (ComponentName) Static: (Severity) Information: Using the following configuration for RSSL_Cons_Connection_Manager "Default::Connection_Sub": hostName = "", rsslPort = "14002", requestQueueReadThreshold = 1, watchListTableSize = 10000, requestTimeout = 45000, itemPostTimeout = 15000, serverList = { server74, server75, server76, server09, server10 }, Connection capabilities: OMMMsgConnCapability
[21:30:39]: (ComponentName) Static: (Severity) Warning: Batch request size may be limited due to throttling on RSSL_Cons_Connection "Default::Connection_Sub server74:14002"

We're using RFA C++ 7.5.1.L1

Regards.

treprfarfa-apirsslconfigurationconsumer
icon clock
10 |1500

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Upvotes
Accepted
1.2k 23 28 42

When using UPA the terminology in the RFA logs makes a bit more sense. Basically your app received a TCP ACK from server76 then sends an RSSL handshake, whilst attempting to send those first few bytes of information the socket was closed and thus the "channel flush" failed. The "flush" command is used to send buffered messages on the wire, you will rarely see a "channel write" failed.

I would like to see a test with only "server09, server10" in the "serverList" to verify if the API can connect at all. If these servers are skipped it is an implication that they cannot be resolved to an IPv4 target address.

It is highly recommended to reproduce the scenario with the latest 8.0 or 7.6 emergency loads to verify the exhibited behaviour is not an actual defect that has already been resolved.

Also, try using the IPv4 address instead of the hostname.

You can null route or reject the target server IP addresses to temporarily test the cold standby failover mechanism, e.g.

echo -n "Blocking connectivity to ads1"
route add -host ads1 reject
read -p " ... "
route delete -host ads1 reject
icon clock
10 |1500

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Upvotes
54 4 4 9

Thanks for your answer.

The day after this issue, it worked properly with the 2 last servers:

[09:13:07]: (ComponentName) Static: (Severity) Warning: RSSL Channel closing on connection "Default::Connection_Rssl_Sub server75:14002" due to "channel read failure"
[09:13:07]: (ComponentName) Static: (Severity) Information: Connection Status Changed, RSSL_Cons_Connection "Default::Connection_Rssl_Sub server75:14002" State: "Down" StatusCode: "None" StatusText: Connection down
[09:13:07]: (ComponentName) Static: (Severity) Information: RSSL_Cons_Connection "Default::Connection_Rssl_Sub server76:14002" will be tried in 100 milliseconds
[09:13:29]: (ComponentName) Static: (Severity) Warning: RSSL Channel closing on connection "Default::Connection_Rssl_Sub server76:14002" due to "channel disconnect"
[09:13:29]: (ComponentName) Static: (Severity) Information: Connection Status Changed, RSSL_Cons_Connection "Default::Connection_Rssl_Sub server76:14002" State: "Down" StatusCode: "None" StatusText: Connection down
[09:13:29]: (ComponentName) Static: (Severity) Information: RSSL_Cons_Connection "Default::Connection_Rssl_Sub server09:14002" will be tried in 100 milliseconds
[09:13:29]: (ComponentName) Static: (Severity) Information: RSSL channel init in progress for connection "Default::Connection_Rssl_Sub server09:14002".
[09:13:29]: (ComponentName) Static: (Severity) Information: RSSL channel connected for connection "Default::Connection_Rssl_Sub server09:14002".
[09:13:29]: (ComponentName) Static: (Severity) Information: RSSL channel on connection "Default::Connection_Rssl_Sub server09:14002" has negotiated ping timeout set to: 30000
[09:13:29]: (ComponentName) Static: (Severity) Information: Connection Status Changed, RSSL_Cons_Connection "Default::Connection_Rssl_Sub server09:14002" State: "Up" StatusCode: "None" StatusText: Connection up
[09:13:33]: (ComponentName) Static: (Severity) Information: RSSL Channel closing on connection "Default::Connection_Rssl_Sub server09:14002" due to "requested disconnect"
[09:16:51]: (ComponentName) Static: (Severity) Error: OMMConsumer::registerClient() has been called with an item request prior to login request. Login request is expected first.


If I try to null route the target server IP address, it works properly and it switches to the next server. In this example, the server list is mkdppsusr01d, mkdppsbck01d. The null route is configured for mkdppsusr01d only:

[16:29:48,865]: (ComponentName) Static: (Severity) Information: The RSSL_Cons_Adapter initialization succeeded
[16:29:48,865]: (ComponentName) Static: (Severity) Information: Using the following configuration for RSSL_Cons_Connection_Manager "Default::Connection_SSLED_RMDS_Dict": requestQueueReadThreshold = 1, watchListTableSize = 10000, requestTimeout = 45000, itemPostTimeout = 15000, payloadCacheEnabled = False, payloadCacheLoadFileDictionary = True, payloadCacheDictionaryName = RDMFieldDictionary, payloadCacheDictionaryPerService = False, serverList = { mkdppsusr01d, mkdppsbck01d }, Connection capabilities: OMMMsgConnCapability
[16:29:48,866]: (ComponentName) Static: (Severity) Warning: Batch request size may be limited due to throttling on RSSL_Cons_Connection "Default::Connection_SSLED_RMDS_Dict mkdppsusr01d:14002"
[16:29:48,867]: (ComponentName) Static: (Severity) Information: Using the following configuration for RSSL_Cons_Connection "Default::Connection_SSLED_RMDS_Dict mkdppsusr01d:14002": hostName = "mkdppsusr01d", rsslPort = "14002", interfaceName = "localhost", compressionType = 0, channelBlocking = false, recvBufSize = 65535, sendBufSize = 65535, writeBufferMaxFragmentLength = 6144, connectionWaitTimeout = 30000, connectionPingTimeout = 30000, guaranteedOutputBuffers = 400, numInputBuffers = 5, tcp_nodelay = True, throttleEnabled = True, throttleType = count, throttleMaxCount = 250, throttleBatchCount = 10, traceMsg = False, traceMsgToFile = False, dictionaryRequestTimeout = 45
[16:29:48,966]: (ComponentName) Static: (Severity) Information: RSSL channel init in progress for connection "Default::Connection_SSLED_RMDS_Dict mkdppsusr01d:14002".
[16:29:48,968]: (ComponentName) Static: (Severity) Information: RSSL channel connected for connection "Default::Connection_SSLED_RMDS_Dict mkdppsusr01d:14002".
[16:29:48,968]: (ComponentName) Static: (Severity) Information: RSSL channel on connection "Default::Connection_SSLED_RMDS_Dict mkdppsusr01d:14002" has negotiated ping timeout set to: 30000
[16:29:48,969]: (ComponentName) Static: (Severity) Information: Connection Status Changed, RSSL_Cons_Connection "Default::Connection_SSLED_RMDS_Dict mkdppsusr01d:14002" State: "Up" StatusCode: "None" StatusText: Connection up
[16:29:48,988]: Received an event for the connection with name 'Connection_SSLED_RMDS_Dict', with type 'RSSL', with state up, associated to the status text 'Connection up' and status code None 
[16:29:48,989]: (ComponentName) Static: (Severity) Warning: Connection "Connection_SSLED_RMDS_Dict" does not support views. View requests will be converted to full item requests for this connection.
[16:29:48,990]: (ComponentName) Static: (Severity) Warning: Connection "Connection_SSLED_RMDS_Dict" does not support Pause/Resume. Pause and Resume requests for this connection will be ignored.
[16:29:48,990]: (ComponentName) Static: (Severity) Warning: Connection "Connection_SSLED_RMDS_Dict" does not support optimized Pause/Resume. Optimized Pause/Resume requests will be converted to individual item Pause/Resume requests for this connection.

Regards.

icon clock
10 |1500

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.