I am implementing a mechanism to extract historical time-series from DSS using the REST API. The code is written in Java and interfaces directly with the API via HTTP.
All is going well, but performance is still an issue. I understand the TimeSeriesExtractionRequest operation allows up to 1000 instruments to be requested at a time, but I haven't tested the full capabilities yet. I started with 20 instruments at a time and increased to 50 per request; by then I started encountering issues downloading the extract using the MonitorLocation URL. Most of the time, after 3 attempts my connection (the proxy, actually) would time out, and the 4th attempt would result in a 404 error.
So my question is: what is the best strategy for extracting historical data for a large number of securities using the REST API? Is there a recommended number of instruments per request? I noticed the API does not return a Retry-After header indicating how long I should wait - do you recommend a specific wait time?
On a different note, on some occasions (seemingly at random) I noticed the Monitor URL was being returned with localhost in the hostname instead of hosted.datascopeapi.reuters.com, which obviously caused my application to fail. Is that a known bug in the platform?
OK, so I managed to run the application outside of the proxy environment, and indeed I was able to wait for longer than 120 seconds in some instances, so this tells me that the timeout issue is happening in our proxy.
My test basically entailed sending 4 batches of 100 instruments and waiting until the extract was ready by polling the Monitor URL. I did still get an error for 3 of the requests, but at least I have collected evidence that the proxy was the culprit for the timeout issue.
What I noticed in general is that you don't necessarily get better performance (i.e. faster extraction for a given total number of instruments) by requesting more at a time. Given that we observed more errors as we increased the number of instruments per request, I am thinking we should leave it at about 20 and get a faster overall response by waiting less time between polls of the Monitor URL, avoiding the 404 when it fails.
I am very interested in both of your failure conditions. Are you aware of the diagnostic aspects of the DSS API? I would like to have you set up some special diagnostic conditions so that we might correlate your failures with our logs and fully identify these issues. From your description, neither of them is a known, documented bug.
On each of your requests, you can include an HTTP header named "X-Client-Session-Id" with any unique local identifier of your choosing, which we will note in all of our logs associated with that call. The usual description for it is "a value that is generated by, and of interest to, the client, and is used to tell us what it is they are working on for a group of requests", so it might be a transaction identifier from your internal system.
But in this case, I want you to use a unique identifier on every call (even each monitor call). If you do have a local system identifier that makes sense to use, you could simply append a running counter to it for each call: XXX-1, XXX-2, XXX-3, ... where XXX is the identifier of your local unit of work.
Then, when the results do not meet your expectation, I need you to log these identifiers along with your results so that you can report the specific call whose answer was unacceptable.
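For example, a counter-based generator along these lines would do; the class and the prefix you pass in are just a sketch, not part of the DSS API:

```java
import java.util.concurrent.atomic.AtomicLong;

// Generates per-call session ids of the form XXX-1, XXX-2, XXX-3, ...
// where XXX is your own local work identifier (the prefix is an example).
final class SessionIdFactory {
    private final String prefix;
    private final AtomicLong counter = new AtomicLong();

    SessionIdFactory(String prefix) {
        this.prefix = prefix;
    }

    // Returns a new unique id on every call, including monitor polls.
    String next() {
        return prefix + "-" + counter.incrementAndGet();
    }
}
```

You would then set the result as the header value on every call, e.g. `connection.setRequestProperty("X-Client-Session-Id", ids.next())`, and write the same value to your own log.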
Thanks @Rick Weyrauch and @Steven McCoy for your suggestions.
I made some changes to my code: I set Prefer: respond-async, wait=1 and also added a unique X-Client-Session-Id to each request so that it can be tracked.
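In code, the headers I now send on each monitor poll look roughly like this (a simplified sketch of my Java client; the method and parameter names are illustrative):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Builds the headers attached to every monitor poll: the async preference
// with its wait bound, and the unique per-call session id for tracing.
final class MonitorHeaders {
    static Map<String, String> build(String sessionId, int waitSeconds) {
        Map<String, String> h = new LinkedHashMap<>();
        h.put("Prefer", "respond-async, wait=" + waitSeconds);
        h.put("X-Client-Session-Id", sessionId);
        return h;
    }
}
```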
I left my code running with 50 instruments per request, and every now and then I still get the errors. When it happens, it generally starts with an error indicating that the server failed to respond (my code attempts the same MonitorLocation 3 times, and eventually the attempts are exhausted or I get a 404 error, presumably indicating that one of the previous attempts downloaded the extract successfully, but it didn't).
[23/Jun/2016 07:18:45 BST] Monitor Location is: https://hosted.datascopeapi.reuters.com/RestApi/v1/monitor/'0x054eff91ca652703'
[23/Jun/2016 07:19:15 BST] X-Client-Session-Id (attempt #1) => DB-MKDT-LND-7832181036738167956
[23/Jun/2016 07:21:18 BST] Unexpected error requesting extraction URL (attempt #1): The target server failed to respond
[23/Jun/2016 07:21:18 BST] X-Client-Session-Id (attempt #2) => DB-MKDT-LND-5545976923871085103
[23/Jun/2016 07:23:20 BST] Unexpected error requesting extraction URL (attempt #2): The target server failed to respond
[23/Jun/2016 07:23:20 BST] X-Client-Session-Id (attempt #3) => DB-MKDT-LND-5892764125184039040
[23/Jun/2016 07:23:21 BST] Time elapsed waiting for Request: 245.625s
[23/Jun/2016 07:23:21 BST] HTTP/1.1 404 Not Found
[23/Jun/2016 07:33:53 BST] Monitor Location is: https://hosted.datascopeapi.reuters.com/RestApi/v1/monitor/'0x054f003cbb02fe65'
[23/Jun/2016 07:34:23 BST] X-Client-Session-Id (attempt #1) => DB-MKDT-LND-4177124253119362225
[23/Jun/2016 07:36:24 BST] Unexpected error requesting extraction URL (attempt #1): The target server failed to respond
[23/Jun/2016 07:36:24 BST] X-Client-Session-Id (attempt #2) => DB-MKDT-LND-5002716494103885290
[23/Jun/2016 07:38:26 BST] Unexpected error requesting extraction URL (attempt #2): The target server failed to respond
[23/Jun/2016 07:38:26 BST] X-Client-Session-Id (attempt #3) => DB-MKDT-LND-2695544550713596376
[23/Jun/2016 07:40:27 BST] Unexpected error requesting extraction URL (attempt #3): The target server failed to respond
[23/Jun/2016 07:40:27 BST] Failed to fullfill request after 3 attempts
I do get the impression that the server is taking too long to process the request (ignoring the wait=1 on Prefer: respond-async) and marking the extract as downloaded, so when I try again I get the 404 error. Is that how it's supposed to work: once you reach the MonitorLocation and the extract is ready, you can only download it once?
From your logs it looks like you wait 30 seconds before attempting to retrieve data from the monitor URL. The time to prepare the data depends on the requested data set size, and on the server load. Try lengthening this polling interval. If this works, you could also consider increasing it incrementally (i.e. each subsequent interval could be longer than the preceding one).
Also, the wait=1 defines an upper bound of only 1 second for the server to process the request. 1 second is very short. Try lengthening this wait time as well.
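An incremental backoff could look like the sketch below; the intervals are illustrative assumptions, not documented DSS recommendations:

```java
// Incremental backoff between monitor polls: double the interval on each
// attempt, starting at 30 seconds and capping at 5 minutes. All values
// here are examples to tune against your own data set sizes.
final class PollBackoff {
    static long nextDelayMs(int attempt) {
        long base = 30_000L;                       // first wait: 30 s
        long delay = base << Math.min(attempt, 4); // 30s, 60s, 120s, 240s, ...
        return Math.min(delay, 300_000L);          // cap at 5 minutes
    }
}
```

Your polling loop would then sleep for `PollBackoff.nextDelayMs(attempt)` before each retry instead of a fixed 30 seconds.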
And yes, to answer your question: once you have reached the MonitorLocation and the extract is ready, you can only download it once.
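Given that once-only behavior, the poll loop should stop permanently after the first successful download. A sketch along these lines (the MonitorClient interface is a hypothetical abstraction over the actual HTTP call, not part of the DSS API):

```java
// Hypothetical abstraction over the HTTP GET on the monitor URL; an
// implementation would save the response body to disk on a 200.
interface MonitorClient {
    int get(String monitorUrl);  // returns the HTTP status code
}

final class MonitorPoller {
    // Polls until the extract is delivered, the server says 404 (already
    // consumed), or the attempts are exhausted.
    static boolean poll(MonitorClient client, String url, int maxAttempts, long delayMs) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            int status = client.get(url);
            if (status == 200) return true;   // extract saved; never GET this URL again
            if (status == 404) return false;  // already downloaded on an earlier attempt
            try {
                Thread.sleep(delayMs);        // still processing; wait and retry
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false;                         // attempts exhausted
    }
}
```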
Looking at your first failure (/RestApi/v1/monitor/'0x054eff91ca652703'), I found that it was successfully retrieved on this request:
Start: 2016-06-23 06:19:16.469
End: 2016-06-23 06:22:16.922
Using: Apache-HttpClient/4.3.3 (java 1.5)
Request Id: 1117fb9c-7483-49ff-938c-78fec6b90c75
Here is what I believe happened. Due to a bit of a bug in 10.5 (fixed in 10.6), this request took 180 seconds, but you only waited 30 seconds. Like Chris noted, you must have a local 30-second timeout configured that limits how long your client will wait for the reply. Normally, your local timeout should be some factor over the server's allowed processing time: if you tell the server 30 seconds, your client-side timeout should be around 40 seconds. Just for our 10.5 version, you will need to use a longer-than-normal client-side timeout value even while using respond-async with a lower wait value (the default of 30 seconds should be fine for most cases).
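As a rule of thumb in code (the 10-second margin is just an example, not a documented value):

```java
// Client-side read timeout should exceed the server's allowed processing
// time by some headroom; e.g. a server wait of 30 s -> a ~40 s client timeout.
final class Timeouts {
    static int clientReadTimeoutMs(int serverWaitSeconds) {
        return (serverWaitSeconds + 10) * 1000;  // 10 s of headroom (illustrative)
    }
}
```

With `java.net.HttpURLConnection`, for instance, you would apply it via `connection.setReadTimeout(Timeouts.clientReadTimeoutMs(30))`.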
Thanks Christian and Rick for your input again.
I've done further tests today and increased all possible timeout limits (socket, linger, connection request) on the client side, but the connection is still getting closed.
I noticed there is a pattern which may help in identifying the root cause.
Our code basically attempts to download the extract using the MonitorLocation, always passing wait=1 (I have actually increased it to 5). Between attempts, it waits 30 seconds. Generally, each request takes slightly over the wait time, around 6 seconds.
At some point - presumably when the extract is finally ready in DSS - the request takes longer than 5 seconds. When my extraction covers a small number of instruments per request (say, up to 20), this works fine and the request is fulfilled in around 70 seconds. However, when I request more than that - say, 50 instruments at a time - the final request to the MonitorLocation takes a lot longer. What I noticed is that at exactly the 120-second mark, my connection gets closed.
This is interesting - always 120 seconds - so I started searching for client and server timeouts, and the first thing I found was:
"The default connection timeout for IIS 7.5 is 120 seconds, which means after this time http session will be terminated. When a user visit a page and keeps the page open for indefinite time without any activity, the IIS need to keep the connection alive—this causes IIS to spend computing resources for this connection to keep alive."
I can see from the HTTP headers that DSS runs on IIS 7.5, so I wonder if that's what's happening.
So my question is: is it possible that DSS is not returning anything for those first 120 seconds, and therefore IIS is closing the client connection?
It will require a DSS server side person to answer your last query.
In the meantime, another point might be worth investigating: your original query mentions a proxy. Could that proxy have a connection timeout of 120 seconds?
I tried to reproduce what you observe with the attached query in Postman, got all the results for 50 RICs at the first attempt to retrieve the monitor URL. If you could give me more details on your request (field list, date range) I could try those out and see if I also get this timeout.
I made a few more tests, experimentally confirming that once a GET on the monitor URL has delivered the extract, any subsequent GET returns a 404.
Details of the tests:
Test 1: 5 years of data for 50 instruments. 1 GET per minute on the monitor URL. The 22nd GET lasted approximately 200 seconds and delivered all the data (>1 million lines). The 23rd GET delivered a 404.
Test 2: 5 years of data for 50 instruments. 1 GET per 5 minutes on the monitor URL. The 2nd GET lasted approximately 180 seconds and delivered all the data. The 3rd GET delivered a 404.
Test 3: 10 years of data for 50 instruments. 1 GET per 4 minutes on the monitor URL. The 3rd GET lasted approximately 300 seconds (>1.6 million lines). The 4th GET delivered a 404.
Test 4: 10 years of data for 50 instruments. 1 GET per 5 minutes on the monitor URL. The 3rd GET lasted approximately 260 seconds (>1.6 million lines). The 4th GET delivered a 404.
Headers for all my requests:
Prefer: respond-async, wait=5
Content-Type: application/json