I am implementing a mechanism to extract historical time-series from DSS using the REST API. The code is written in Java and interfaces directly with the API via HTTP.
All is going well, but performance is still an issue. I understand the TimeSeriesExtractionRequest operation allows up to 1000 instruments to be requested at a time, but I haven't tested the full capabilities yet. I started with 20 instruments at a time and increased to 50 per request; by then I started encountering issues downloading the extract using the MonitorLocation URL. Most of the time, after 3 attempts my connection (the proxy, actually) would time out, and the 4th attempt would result in a 404 error.
So my question is: what is the best strategy for extracting historical data for a large number of securities using the REST API? Is there a recommended number of instruments per request? I noticed the API does not return a Retry-After header indicating how long I should wait - do you recommend a specific wait time?
On a different note, on some occasions (seemingly at random) I noticed the Monitor URL was being returned with localhost in the hostname instead of hosted.datascopeapi.reuters.com, which obviously caused my application to fail. Is that a known bug in the platform?
OK, so I managed to run the application outside of the proxy environment, and indeed I was able to wait for longer than 120 seconds in some instances, so this tells me that the timeout issue is happening in our proxy.
My test basically entailed sending 4 batches of 100 instruments and waiting until the extract was ready by polling the Monitor URL. I did still get an error for 3 of the requests, but at least I have collected evidence that the proxy was the culprit for the timeout issue.
What I noticed in general is that you don't necessarily get better performance (i.e. faster extraction for a given total number of instruments) by requesting more at a time. Given that we observed more errors as we increased the number of instruments per request, I am thinking we should leave it at about 20 and get a faster overall response by waiting less time between polls of the Monitor URL, avoiding the 404 when it fails.
I am very interested in both of your failure conditions. Are you aware of the diagnostic aspects of the DSS API? I would like to have you set up some special diagnostic conditions so that we might correlate your failures with our logs and fully identify these issues. From your description, neither of them is a known, documented bug.
On each of your requests, you can include an HTTP header named "X-Client-Session-Id" with any unique local identifier of your choosing, which we will note in all of our logs associated with that call. The usual description for it is "a value that is generated by, and of interest to, the client, and is used to tell us what it is they are working on for a group of requests", so it might be a transaction identifier from your internal system.
But in this case, I want you to use a unique identifier on every call (even each monitor call). If you do have a local system identifier that makes sense to use, you could simply append a running counter to it for each call: XXX-1, XXX-2, XXX-3, ... where XXX is the identifier of your local unit of work.
Then, when the results do not meet your expectation, I need you to log these identifiers along with your results so that you can report the specific call whose answer was unacceptable.
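For example, a counter-based generator along these lines would do; the class and the prefix you pass in are just a sketch, not part of the DSS API:

```java
import java.util.concurrent.atomic.AtomicLong;

// Generates per-call session ids of the form XXX-1, XXX-2, XXX-3, ...
// where XXX is your own local work identifier (the prefix is an example).
final class SessionIdFactory {
    private final String prefix;
    private final AtomicLong counter = new AtomicLong();

    SessionIdFactory(String prefix) {
        this.prefix = prefix;
    }

    // Returns a new unique id on every call, including monitor polls.
    String next() {
        return prefix + "-" + counter.incrementAndGet();
    }
}
```

You would then set the result as the header value on every call, e.g. `connection.setRequestProperty("X-Client-Session-Id", ids.next())`, and write the same value to your own log.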
Thanks @Rick Weyrauch and @Steven McCoy for your suggestions.
I made some changes to my code: I set Prefer: respond-async, wait=1 and also added a unique X-Client-Session-Id to each request so that it can be tracked.
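In code, the headers I now send on each monitor poll look roughly like this (a simplified sketch of my Java client; the method and parameter names are illustrative):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Builds the headers attached to every monitor poll: the async preference
// with its wait bound, and the unique per-call session id for tracing.
final class MonitorHeaders {
    static Map<String, String> build(String sessionId, int waitSeconds) {
        Map<String, String> h = new LinkedHashMap<>();
        h.put("Prefer", "respond-async, wait=" + waitSeconds);
        h.put("X-Client-Session-Id", sessionId);
        return h;
    }
}
```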
I left my code running with 50 instruments per request, and every now and then I still get the errors. When it happens, it generally starts with an error indicating that the server failed to respond (my code attempts the same MonitorLocation 3 times, and eventually the attempts are exhausted or I get a 404 error, presumably indicating that one of the previous attempts downloaded the extract successfully, but it didn't).
[23/Jun/2016 07:18:45 BST] Monitor Location is: https://hosted.datascopeapi.reuters.com/RestApi/v1/monitor/'0x054eff91ca652703'
[23/Jun/2016 07:19:15 BST] X-Client-Session-Id (attempt #1) => DB-MKDT-LND-7832181036738167956
[23/Jun/2016 07:21:18 BST] Unexpected error requesting extraction URL (attempt #1): The target server failed to respond
[23/Jun/2016 07:21:18 BST] X-Client-Session-Id (attempt #2) => DB-MKDT-LND-5545976923871085103
[23/Jun/2016 07:23:20 BST] Unexpected error requesting extraction URL (attempt #2): The target server failed to respond
[23/Jun/2016 07:23:20 BST] X-Client-Session-Id (attempt #3) => DB-MKDT-LND-5892764125184039040
[23/Jun/2016 07:23:21 BST] Time elapsed waiting for Request: 245.625s
[23/Jun/2016 07:23:21 BST] HTTP/1.1 404 Not Found
[23/Jun/2016 07:33:53 BST] Monitor Location is: https://hosted.datascopeapi.reuters.com/RestApi/v1/monitor/'0x054f003cbb02fe65'
[23/Jun/2016 07:34:23 BST] X-Client-Session-Id (attempt #1) => DB-MKDT-LND-4177124253119362225
[23/Jun/2016 07:36:24 BST] Unexpected error requesting extraction URL (attempt #1): The target server failed to respond
[23/Jun/2016 07:36:24 BST] X-Client-Session-Id (attempt #2) => DB-MKDT-LND-5002716494103885290
[23/Jun/2016 07:38:26 BST] Unexpected error requesting extraction URL (attempt #2): The target server failed to respond
[23/Jun/2016 07:38:26 BST] X-Client-Session-Id (attempt #3) => DB-MKDT-LND-2695544550713596376
[23/Jun/2016 07:40:27 BST] Unexpected error requesting extraction URL (attempt #3): The target server failed to respond
[23/Jun/2016 07:40:27 BST] Failed to fullfill request after 3 attempts
I do get the impression that the server is taking too long to process the request (ignoring the wait=1 on Prefer: respond-async) and marking the extract as downloaded, so when I try again I get the 404 error. Is that how it's supposed to work: once you reach the MonitorLocation and the extract is ready, you can only download it once?
From your logs it looks like you wait 30 seconds before attempting to retrieve data from the monitor URL. The time to prepare the data depends on the requested data set size, and on the server load. Try lengthening this polling interval. If this works, you could also consider increasing it incrementally (i.e. each subsequent interval could be longer than the preceding one).
Also, the wait=1 defines an upper bound of only 1 second for the server to process the request. 1 second is very short. Try lengthening this wait time as well.
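An incremental backoff could look like the sketch below; the intervals are illustrative assumptions, not documented DSS recommendations:

```java
// Incremental backoff between monitor polls: double the interval on each
// attempt, starting at 30 seconds and capping at 5 minutes. All values
// here are examples to tune against your own data set sizes.
final class PollBackoff {
    static long nextDelayMs(int attempt) {
        long base = 30_000L;                       // first wait: 30 s
        long delay = base << Math.min(attempt, 4); // 30s, 60s, 120s, 240s, ...
        return Math.min(delay, 300_000L);          // cap at 5 minutes
    }
}
```

Your polling loop would then sleep for `PollBackoff.nextDelayMs(attempt)` before each retry instead of a fixed 30 seconds.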
And yes, to answer your question: once you have reached the MonitorLocation and the extract is ready, you can only download it once.
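Given that once-only behavior, the poll loop should stop permanently after the first successful download. A sketch along these lines (the MonitorClient interface is a hypothetical abstraction over the actual HTTP call, not part of the DSS API):

```java
// Hypothetical abstraction over the HTTP GET on the monitor URL; an
// implementation would save the response body to disk on a 200.
interface MonitorClient {
    int get(String monitorUrl);  // returns the HTTP status code
}

final class MonitorPoller {
    // Polls until the extract is delivered, the server says 404 (already
    // consumed), or the attempts are exhausted.
    static boolean poll(MonitorClient client, String url, int maxAttempts, long delayMs) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            int status = client.get(url);
            if (status == 200) return true;   // extract saved; never GET this URL again
            if (status == 404) return false;  // already downloaded on an earlier attempt
            try {
                Thread.sleep(delayMs);        // still processing; wait and retry
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false;                         // attempts exhausted
    }
}
```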
Looking at your first failure (/RestApi/v1/monitor/'0x054eff91ca652703'), I found that it was successfully retrieved on this request:
Start: 2016-06-23 06:19:16.469
End: 2016-06-23 06:22:16.922
Using: Apache-HttpClient/4.3.3 (java 1.5)
Request Id: 1117fb9c-7483-49ff-938c-78fec6b90c75
Here is what I believe happened. Due to a bit of a bug in 10.5 (fixed in 10.6), this request took 180 seconds, but you only waited 30 seconds. Like Chris noted, you must have a local 30-second timeout configured that limits how long your client will wait for the reply. Normally, your local timeout should be some factor over the server's allowed processing time: if you tell the server 30 seconds, your client-side timeout should be around 40 seconds. Just for our 10.5 version, you will need to use a longer-than-normal client-side timeout value even while using respond-async with a lower wait value (the default of 30 seconds should be fine for most cases).
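As a rule of thumb in code (the 10-second margin is just an example, not a documented value):

```java
// Client-side read timeout should exceed the server's allowed processing
// time by some headroom; e.g. a server wait of 30 s -> a ~40 s client timeout.
final class Timeouts {
    static int clientReadTimeoutMs(int serverWaitSeconds) {
        return (serverWaitSeconds + 10) * 1000;  // 10 s of headroom (illustrative)
    }
}
```

With `java.net.HttpURLConnection`, for instance, you would apply it via `connection.setReadTimeout(Timeouts.clientReadTimeoutMs(30))`.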
Thanks Christian and Rick for your input again.
I've done further tests today and increased all possible timeout limits (socket, linger, connection request) on the client side, but the connection is still getting closed.
I noticed there is a pattern which may help in identifying the root cause.
Our code basically attempts to download the extract using the MonitorLocation, always passing wait=1 (I have actually increased it to 5). Between attempts, it waits 30 seconds. Generally, each request takes slightly over the wait time, around 6 seconds.
At some point - presumably when the extract is finally ready in DSS - the request takes longer than 5 seconds. When my extraction covers a small number of instruments per request (say, up to 20), this works fine and the request is fulfilled in around 70 seconds. However, when I request more than that - say, 50 instruments at a time - the final request to the MonitorLocation takes a lot longer. What I noticed is that at exactly the 120-second mark, my connection gets closed.
This is interesting - always 120 seconds - so I started searching for client and server timeouts, and the first thing I found was:
"The default connection timeout for IIS 7.5 is 120 seconds, which means after this time http session will be terminated. When a user visit a page and keeps the page open for indefinite time without any activity, the IIS need to keep the connection alive—this causes IIS to spend computing resources for this connection to keep alive."
I can see from the HTTP headers that DSS runs on IIS 7.5, so I wonder if that's what's happening.
So my question is: is it possible that DSS is not returning anything for those first 120 seconds, and therefore IIS is closing the client connection?
It will require a DSS server side person to answer your last query.
In the meantime, another point might be worth investigating: your original query mentions a proxy. Could that proxy have a connection timeout of 120 seconds?
I tried to reproduce what you observe with the attached query in Postman, got all the results for 50 RICs at the first attempt to retrieve the monitor URL. If you could give me more details on your request (field list, date range) I could try those out and see if I also get this timeout.
I made a few more tests, experimentally confirming that once a GET on the monitor URL has delivered the extract, any subsequent GET returns a 404.
Details of the tests:
Test 1: 5 years of data for 50 instruments. 1 GET per minute on the monitor URL. The 22nd GET lasted approximately 200 seconds and delivered all the data (>1 million lines). The 23rd GET delivered a 404.
Test 2: 5 years of data for 50 instruments. 1 GET per 5 minutes on the monitor URL. The 2nd GET lasted approximately 180 seconds and delivered all the data. The 3rd GET delivered a 404.
Test 3: 10 years of data for 50 instruments. 1 GET per 4 minutes on the monitor URL. The 3rd GET lasted approximately 300 seconds (>1.6 million lines). The 4th GET delivered a 404.
Test 4: 10 years of data for 50 instruments. 1 GET per 5 minutes on the monitor URL. The 3rd GET lasted approximately 260 seconds (>1.6 million lines). The 4th GET delivered a 404.
Headers for all my requests:
Prefer: respond-async, wait=5
Content-Type: application/json