Extracting TRTH intraday data for S&P 500 companies with X-Direct-Download: true
When I request 500 companies' bid/ask/price/volume data for the past year, the server returns 202 and never goes any further than that.
I thought there was no quota (with x-direct-download: true), but maybe there is? How can I retrieve this data?
Best Answer
-
Hi @Xiao.Xiao
Just want to add that you can check the progress of the request with the following endpoint.
https://hosted.datascopeapi.reuters.com/RestApi/v1/Jobs/Jobs('JobId')
It will give you the progress percentage of the request, which should give you a rough idea of how long you have to wait.
Here is a sample response:
{
"@odata.context": "https://hosted.datascopeapi.reuters.com/RestApi/v1/$metadata#Jobs/$entity",
"JobId": "0x0625548817cb####",
"UserId": #######,
"Status": "Completed",
"StatusMessage": "Completed in 1st Request, Results Returned",
"Description": "TickHistoryRawReportTemplate Extraction",
"ProgressPercentage": 100,
"CreateDate": "2018-04-20T04:00:10.106Z",
"StartedDate": "2018-04-20T04:00:10.106Z",
"CompletionDate": "2018-04-20T04:00:14.430Z",
"MonitorUrl": "https://hosted.datascopeapi.reuters.com/RestApi/v1/Extractions/ExtractRawResult(ExtractionId='0x0625548817cb####')"
}
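A minimal sketch of how you could query this endpoint from Python (assuming token holds a valid session token from your authentication step; requests is the only dependency):

import requests

# job_id is the JobId returned by your extraction request (masked here as in the sample)
job_id = "0x0625548817cb####"
url = "https://hosted.datascopeapi.reuters.com/RestApi/v1/Jobs/Jobs('" + job_id + "')"
headers = {
    "Prefer": "respond-async",
    "Authorization": "Token " + token,  # token from your authentication request
}
job = requests.get(url, headers=headers).json()
print(job["Status"], job["ProgressPercentage"])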
Answers
-
@Xiao.Xiao, what you observe is normal, and has nothing to do with quotas.
When you make an On Demand extraction, you will get a response within 30 seconds (the default wait time) or less. The HTTP status of the response can take one of several values; here are the two most common:
- 200 OK happens if the request processing completed in less than 30 seconds. This can only occur for very small requests (and even then it is not guaranteed). Your request is for 500 instruments, so there is no chance you will get a 200 OK right after sending it.
- 202 Accepted is the one you are most likely to receive. It means the request was accepted, but processing has not yet completed. The next step is to check the request status by polling it regularly until it returns a 200 OK.
I suggest you look at REST API Tutorial 3, which details the entire workflow; the section on HTTP status 202 is here.
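For reference, the polling part of that workflow looks roughly like this in Python (a sketch, assuming extraction_response is the initial 202 response to your On Demand request and token is a valid session token; the monitor URL comes from the Location header, as described in the tutorial):

import time
import requests

headers = {"Prefer": "respond-async", "Authorization": "Token " + token}

resp = extraction_response                 # initial response, HTTP status 202
monitor_url = resp.headers["Location"]     # URL to poll for completion

while resp.status_code == 202:
    time.sleep(30)                         # poll at a modest interval
    resp = requests.get(monitor_url, headers=headers)

# resp.status_code is now 200; its body contains the JobId used to retrieve the data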
-
Thanks Christiaan. However, I am not able to retrieve any data. I got 202 for 1.5 hours and then the program simply ended without any warning or error message...
I was wondering if there is another way to get a large amount of data? I am requesting 500 companies for the past year right now...
-
Considering your request is for 500 instruments and 1 year of data, it is possible that it could take longer than 1.5 hours. I suggest you wait longer.
"The program ends": what program are you using ? If you post the source code I could have a look at it.
x-direct-download: true has no influence on the data extraction performance. This parameter means that you will be able to download the resulting data (once it was extracted) from the AWS cloud, which will deliver better download performance.
-
I have waited the whole afternoon (~3 hours), and I am still getting 202 all the time.
I am considering breaking my request into smaller ones, but is there a more strategic way to do this?
I suppose this is only a few GB of data...
-
... and the program is almost identical to the `.py` example provided by TRTH.
-
Hi Alex, thanks for asking. I appreciate the replies here; they give me confidence that there is nothing wrong with my request to TRTH. However, I was waiting forever (hours and hours...) trying to fetch 500 companies' data with status code 202. So I ended up looping over batches of 50 instruments per request and got all the data I needed. I am not sure this can be considered "resolved" in this case.
-
@Xiao.Xiao, how long do your requests for 50 instruments take to deliver data? If I were you I'd let a request for 500 run to the end, to see how long it takes and what volume of data it delivers. If you post your code (remove your account and password first, but ensure it includes the 500 instruments) I can also test it here and investigate.
-
For 50 instruments, it takes 3-4 hours...
Well, I didn't get any data for 500 instruments in more than half a day, and I need to move on with my project...
My `.py` is identical to the `.py` code you provided under the example here. The only difference is that I stack 500 instruments according to the user guide (the example shown here only has one instrument)... and I kept getting 202, so I assume my request is valid (?)
-
@Xiao.Xiao, if the request takes 3-4 hours for 50 instruments, then you will not get 500 in half a day...
And yes, if you get a 202 your request is valid. If it were not valid you'd get an HTTP status in the 400 range.
Could you post your entire Python code? I could test it to see if I get the same results as you.
-
@Xiao.Xiao, could you post your entire Python code?
-
trth-ondemand-intradaybars-allpy.zip
Please see the attachment above. I have two local files with all S&P 500 RICs and all S&P 400 RICs respectively.
-
@Xiao.Xiao, please also post the files LIST_SP500.csv and LIST_SP400.csv; I'd like to test with the exact same instrument lists you are using.
0 -
@Xiao.Xiao, let me make a few comments on the code you sent (re-attached for easy reference). It has some important differences compared to our sample available under the downloads tab.
Amount of downloaded data
Your code makes a request for more than a year (~275 business days) of 1-minute bars (1440 records/day) for 900 instruments. That is a large amount of data: 275 × 1440 × 900 ≈ 356 million records. It is not surprising that it takes a long time.
X-Direct-Download
One sets this header to request the data download from AWS, which is faster. This header is only useful when put in the request that downloads the data (i.e. the request that uses the JobId in the endpoint URL). Your code sets this header in getId (line 58 of your code), where it has no effect; it should be set in getRaw (line 132). Changing this should improve download performance (it will not decrease the extraction time).
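To illustrate (a sketch; raw_url stands for the raw-result endpoint built from your JobId, and token for your session token), the header belongs on the data download request:

import requests

headers = {
    "Prefer": "respond-async",
    "Authorization": "Token " + token,
    "X-Direct-Download": "true",  # redirect the download to AWS
}
# stream=True so the response body is not buffered entirely in RAM
r5 = requests.get(raw_url, headers=headers, stream=True)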
Saving the data to file
In writeToFile (line 146) your code is:
rr = r5.raw
connector.write_to_s3('', fileName, rr.read())

This is not efficient: it first pulls all the data into RAM, and only then writes it to disk. RAM usage is therefore quite high, and performance poor.
A better and faster solution would be:
import shutil

chunk_size = 1024
rr = r5.raw
with open(fileName, 'wb') as fd:
    shutil.copyfileobj(rr, fd, chunk_size)

This is taken from step 5 of our sample code; streaming in fixed-size chunks keeps memory usage constant regardless of the file size.
For more info on download tuning with Python
See this article: How to Optimize TRTH (Tick History) file downloads for Python (and other languages).
Hope this helps.
-
Thanks for your response. I was more concerned about your first and second points, as I have already tried different approaches for saving the streaming response to file.
But if I understand you correctly ("It is not surprising that it takes a long time" and "Changing this should improve download performance (it will not decrease the extraction time)"), there is no way to improve the waiting time for the response - which seems to never come back in my case...
-
The server performance depends on its load; you cannot influence the server to deliver a faster extraction time.
You can, however, optimize your code to improve the download time.
You are making a huge request. You say it takes 3-4 hours for 50 RICs; do the math, for 500 RICs it will take more than 1 day...
You need to reduce the number of instruments, and/or shorten the date range, and/or wait as long as it takes.
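If you do split the request, a simple approach is to batch the instrument list (a sketch; all_rics, submit_extraction and download_results are placeholders for your RIC list and your existing request and download code):

def chunks(rics, size):
    """Yield successive batches of at most `size` RICs."""
    for i in range(0, len(rics), size):
        yield rics[i:i + size]

for batch in chunks(all_rics, 50):
    job_id = submit_extraction(batch)   # your existing On Demand request, per batch
    download_results(job_id)            # poll and save as before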