How to Get Data from AWS instead of TRTH server using REST API

pj4 · June 2018

I am using the attached code to request Tick History Raw data. I have specified "X-Direct-Download":"true" and "X-Client-Session-Id":"Direct AWS" in the request headers, but the data is still downloaded from the TRTH server and not from AWS.

Could anyone help me identify what I am doing wrong in the attached code?

Christiaan Meihsl · June 2018

@pj4, I see you are using our TRTH_OnDemand_IntradayBars Python sample, which you have modified. That code will download from AWS, if variable useAws is set to True:

useAws = True

In your code the value for variable useAws is not set, but the variable is used to define if the download is to be from AWS. Please try setting useAws = True at the start of your code.

AWS download: when is the additional header required?

If you want to download the data from AWS, the request to download it requires an additional header:

"X-Direct-Download":"true"

This header is only used in the call that retrieves the data; it is ignored in all other calls.

Therefore it can be removed from calls r1 (line 4, token request), r2 (line 31, extraction request), r3 (line 71, status poll). The only place where it is useful is in call r5 (line 112, data retrieval request).

See this article for more details on AWS download.
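As a minimal sketch of the point above, the helper below adds the header only when building the headers for the data retrieval call. The function name build_retrieval_headers and its parameters are hypothetical. The "Prefer" and "Authorization" values mirror headers used elsewhere in this thread and are assumed to apply to the retrieval call as well.

```python
def build_retrieval_headers(token, use_aws):
    """Build headers for the data retrieval call (r5 in this thread)."""
    headers = {
        "Prefer": "respond-async",
        "Authorization": "token " + token,
    }
    if use_aws:
        # Only the retrieval call needs this header; other calls ignore it.
        headers["X-Direct-Download"] = "true"
    return headers
```

Passing use_aws as an explicit argument, rather than reading a module-level variable, makes it easier to see which value is in effect at the moment the headers are built.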

A few other comments on your code

X-Client-Session-Id is required for us to trace specific calls to debug. It has no influence on AWS, and to be useful should have a unique new value every single time it is used.
Why set the user-agent in the headers (lines 113 & 123)? It is not required.
time.sleep(3) is not required (line 129)
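One possible way to give X-Client-Session-Id a new value on every use is to generate it from a UUID. This is only a sketch: the helper name new_session_id and the "trth" prefix are hypothetical, and any scheme that produces unique values would serve the same purpose.

```python
import uuid

def new_session_id(prefix="trth"):
    """Return a session id that differs on every call."""
    # uuid4() generates a fresh random identifier each time.
    return prefix + "-" + str(uuid.uuid4())

# Example: build the header just before sending a request.
headers = {"X-Client-Session-Id": new_session_id()}
```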

pj4 · June 2018

Thanks for your response. I was actually setting "useAWS = True" at the start of the code (outside of any function like request data), and I have now also removed the unnecessary lines as you suggested. However, while downloading data I still got the message below in the console, which shows the data was downloaded from TRTH and not AWS:

"Content response headers (TRTH server): type: text/plain - encoding: gzip"

Further, I have encountered one more issue (it might be specific to the instrument): when I tried to download FID 70 (settlement data) for Futures "ED" (EuroDollar) from 1 Jan 2000 to 31 May 2018, I got errors in the Python console (attached for reference): error-while-downloading-data-for-ed.txt erro-while-extracting-gzip-file.png

The gzip file was created, but judging by its size it does not seem to contain the complete data, and it cannot be extracted with the Zip Genius tool (snapshot of the error attached).

veerapath.rungruengrayubkul · June 2018

https://community.developers.refinitiv.com/discussion/comment/28941#Comment_28941

Hi @pj4,

Variable names in Python are case sensitive. Please ensure that the setting is "useAws = True", not "useAWS = True".

pj4 · June 2018

Yes, I am already using "useAws = True", but the issue persists.

veerapath.rungruengrayubkul · June 2018

https://community.developers.refinitiv.com/discussion/comment/28945#Comment_28945

Can you share the whole code so I can try it? Please ensure that your DSS username and password are removed from the code.

Christiaan Meihsl · June 2018

https://community.developers.refinitiv.com/discussion/comment/28941#Comment_28941

@pj4, looking at the if condition in the last lines of the code you posted: if you see "Content response headers (TRTH server): ..." instead of "Content response headers (AWS server): ...", it means that the value of useAws is False at that point, and it probably was already False when the headers were set for the last call (r5).

pj4 · June 2018

Yes Christiaan, you were right: useAws was getting set to False before r5. Now I am able to get data from AWS.

Thanks for your help in sorting this out.

However, do you have any idea about the other issue/error which I got while downloading data for Futures "ED" (details in previous responses)?

Christiaan Meihsl · June 2018

https://community.developers.refinitiv.com/discussion/comment/28948#Comment_28948

@pj4, to test I would need the latest version of your code, with the exact request you made for ED.

pj4 · June 2018

thr-timezonemapped-integrated-dataonly-aws1.txt thr-timezonemapped-integrated-modules-aws1.txt

Attached are the two code files. The file ending with "Data Only-aws1" is the main file, which calls the data request function in the other file.

The loop on i in "Data Only" file is on instrument where the first instrument was "ED".

Do let me know should you need any other clarification regarding code since the code is quite clunky. Thanks for your help.

Christiaan Meihsl · June 2018

https://community.developers.refinitiv.com/discussion/comment/28949#Comment_28949

@pj4, 2 files are missing for me to run the code: FilePath&Names.xlsx and TradingHours.csv (those you used when you ran into the error error-while-downloading-data-for-ed.txt erro-while-extracting-gzip-file.png)

pj4 · June 2018

excel-files-used-in-code-content-pasted-in-this-te.txt

Apologies for that. I can't upload Excel files, which is why I attached a text file with the same content as I have in the Excel files.

Hope this helps.

Otherwise, could you try to download data for ED (all 120 contracts) from 1 Jan 2000 to 31 May 2018 for FID 70 from Tick History Raw at your end and see whether it works?

Christiaan Meihsl · June 2018

@pj4, studying your code, I see that at the end you attempt to uncompress all the data, and save it into a string. Considering the number of lines you mention (more than a million), this is bound to fail.

Please set maxLines = 10 and run it again; does it still fail ?
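For a large extraction, one alternative to holding everything in a string is to copy the compressed stream to disk in chunks and decompress only the first few lines for display. The sketch below is written for Python 3, and both function names are hypothetical. With the requests library, the raw argument could be the raw attribute of a response requested with stream=True, but how that fits into your code is an assumption.

```python
import gzip
import shutil

def save_raw_stream(raw, path, chunk_size=1024 * 1024):
    """Copy a binary file-like object to disk in chunks, without decompressing."""
    with open(path, "wb") as out:
        shutil.copyfileobj(raw, out, chunk_size)

def preview_gzip(path, max_lines=10):
    """Return at most max_lines text lines from a gzip file."""
    lines = []
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        for i, line in enumerate(fh):
            if i >= max_lines:
                break
            lines.append(line.rstrip("\n"))
    return lines
```

Memory use then depends on chunk_size and max_lines rather than on the size of the extraction.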

Christiaan Meihsl · June 2018

https://community.developers.refinitiv.com/discussion/comment/28950#Comment_28950

@pj4, I'm struggling to run your code. Manually recreated CSV and XLS files. Had to guess that paths had to be entered in one of the XLS. Changed lines 375 & 376 of "Modules" code (missing parenthesis).

Now I am stuck: missing file MappingTables.xlsx

Can you post it (zipped) ? Ideally, post all required XLSX and CSV files in a single ZIP.

pj4 · June 2018

filesusedincode.zip

Attached are the files used in code. Apologies for missing some files earlier.

I will try changing maxLines to 0 and let you know if it works for ED.

pj4 · June 2018

Yes, with maxLines set to 10, the code worked for ED. Thanks for the help.

However, I ran the code for multiple instruments in a loop. After running for 3 instruments, the code threw an error while running for the last FID (FID 825) of the 4th instrument (AUL). Attached is the error message, in case you have any idea about it. Is it linked to the "location" in the request headers?

error-message.txt

warat.boonyanit · June 2018

https://community.developers.refinitiv.com/discussion/comment/29025#Comment_29025

The error suggests that it cannot get "location" from the header.

Have you checked the response status code? Is it 202?

Could you try printing the entire response?

Christiaan Meihsl · June 2018

@pj4, glad we found the issue related to the maxLines parameter.

To the error message:

Our Python sample was updated a month ago, with slightly modified code in the section labeled "Step 3" (lines 132 and following), to cater for the (rarely occurring) case where data is returned directly instead of an HTTP status 202 with a location URL. The change also caters for possible errors. It looks like you started off with the older version of this sample.

In your code, try replacing your lines 339 - 360:

  requestUrl = r2.headers["location"]
  requestHeaders={
      "Prefer":"respond-async",
      "Content-Type":"application/json",
      "Authorization":"token " + token
  }
  r3 = requests.get(requestUrl,headers=requestHeaders)

  while (r3.status_code == 202):
      print ('As we received a 202, we wait 900 seconds, then poll again (until we receive a 200)')
      time.sleep(900)
      r3 = requests.get(requestUrl,headers=requestHeaders)

  if r3.status_code == 200 :
      r3Json = json.loads(r3.text.encode('ascii', 'ignore'))
      jobId = r3Json["JobId"]

  if r3.status_code != 200 :
      print ('An error occured. Try to run this cell again. If it fails, re-run the previous cell.\n')

with the following:

  status_code = r2.status_code
  print ("HTTP status of the response: " + str(status_code))
  if status_code == 202 :
      requestUrl = r2.headers["location"]
      requestHeaders={
          "Prefer":"respond-async",
          "Content-Type":"application/json",
          "Authorization":"token " + token
      }

  while (status_code == 202):
      print ('As we received a 202, we wait 900 seconds, then poll again (until we receive a 200)')
      time.sleep(900)
      r3 = requests.get(requestUrl,headers=requestHeaders)
      status_code = r3.status_code
      print ("HTTP status of the response: " + str(status_code))

  if status_code == 200 :
      r3Json = json.loads(r3.text.encode('ascii', 'ignore'))
      jobId = r3Json["JobId"]

  if status_code != 200 :
      print ('An error occurred. Try to run this cell again. If it fails, re-run the previous cell.\n')

This will make your code more robust, as it also caters for the rare cases. The error you observed should no longer occur.
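The polling logic can also be wrapped in a small function that keeps the most recent response in a single variable, so that the case where the first response is already a 200 needs no separate r3. This is only a sketch: wait_for_job, its parameters, and the poll callable are hypothetical, and it assumes the response objects expose status_code and headers the way requests responses do.

```python
import time

def wait_for_job(first_response, poll, interval=900, sleep=time.sleep):
    """Poll while the status is 202 and return the last response received.

    first_response: response to the extraction request (r2 in this thread).
    poll: callable that takes the location URL and returns a new response.
    """
    response = first_response
    if response.status_code == 202:
        # The location URL is only present when the extraction is pending.
        location = response.headers["location"]
        while response.status_code == 202:
            sleep(interval)
            response = poll(location)
    return response
```

The caller would then check the status code of the returned response and read the job id only when it is 200.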

pj4 · June 2018

Hi Christiaan, the update you shared worked well. However, while iterating over instruments, on the 4th instrument, I got the attached error while downloading the 2nd field (FID 64) data of that instrument. Surprisingly, the code worked well up to the 1st field (FID 70) of that instrument.

Any idea about this "SSL error"? ssl-error.txt

Christiaan Meihsl · June 2018

https://community.developers.refinitiv.com/discussion/comment/29091#Comment_29091

@pj4, are you using Python 2.6, 2.7 or 3 ? This issue seems unrelated to our API, and more to the Python requests library. Please see this thread and especially this (long) thread, which might help. If you post the latest version of your code I could test it here and see if I run into the same error.

pj4 · June 2018

I am using Python 2.7.13. In addition, I got one more technical issue, "Connection Aborted", while downloading data. I don't know if it is a coincidence, but I got the error while the code was downloading data for the 4th instrument in the list (and at exactly the same FID 64), the same position at which I got the Bad Handshake error yesterday. However, the instrument was different on both days, as I changed the instrument list.

Attached is the error message that I got today: connection-aborted-error.txt. Is it also related to the requests library?

Christiaan Meihsl · June 2018

https://community.developers.refinitiv.com/discussion/comment/29121#Comment_29121

@pj4, the error message was generated by the requests library, but it can have many causes, as a quick search on the net reveals. I suggest you try changing the order of the instruments and FIDs to see if there is a correlation between the error and a specific instrument, FID or position in the list.

Maybe also upgrade to Python 3, based on the info in the threads I sent in my previous comment ?

pj4 · June 2018

Hi Christiaan, the issue was not linked to a particular instrument or FID sequence, as I was able to download many more instruments successfully (but I am not sure why it happened). However, I am currently stuck due to the issue below.

Any idea when we get the following?

"HTTP status of the response: 400"

"local variable 'jobId' referenced before assignment"

Christiaan Meihsl · June 2018

https://community.developers.refinitiv.com/discussion/comment/29166#Comment_29166

@pj4, HTTP status codes returned by our servers are documented on this page. An HTTP status 400 indicates a bad request. Can you post the request that generated the error ?

pj4 · June 2018

Attached is the latest request code I am using: data-request.txt. I get this error even when I try to download different instruments (LH, FC). Do let me know if you need anything else.

Christiaan Meihsl · July 2018

@pj4,

Analysis of the messages you get:

"HTTP status of the response: 400"
"local variable 'jobId' referenced before assignment"

The r2 request generated an HTTP status 400 (bad request). Due to that, the jobId variable was not assigned, but later on it was referred to in request r5.

The HTTP 400 is due to a badly formatted r2 request body. This is probably caused by the content of the XLS parameter files used by the code to generate the request body. To analyze what went wrong here, you need to save the request body, and then we can analyze the one that generated the HTTP 400 error.

Looking at the code, I see the final request r5 for data retrieval is made in all cases. This requires correcting, because it should only be done if the preceding HTTP status was 200 and the jobId variable was assigned.
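One way to express that guard is to start from "no job id" and run the retrieval only when an id was actually obtained. The sketch below is illustrative: extract_job_id is a hypothetical helper, and body stands for the already parsed JSON of the response.

```python
def extract_job_id(status_code, body):
    """Return the JobId from a parsed 200 response body, else None."""
    if status_code != 200:
        return None
    return body.get("JobId")

# Example with a failed extraction request (HTTP 400, empty body).
job_id = extract_job_id(400, {})
if job_id is None:
    print("Extraction failed; skipping data retrieval")
else:
    print("Retrieving data for job " + job_id)
```

Because job_id is always assigned, a failed request leads to a clear message instead of a "referenced before assignment" error.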

I have inserted the latest code you sent into the master code on my PC, and am now running it with a few debugging traces added. If I manage to reproduce the error maybe I'll find the culprit of the 400.

pj4 · July 2018

https://community.developers.refinitiv.com/discussion/comment/29321#Comment_29321

Thanks Christiaan for looking into this. Really appreciate it.

pj4 · July 2018

I always get a 400 error while downloading data for the base RICs LH and LC. Could you try these RICs at your end?

Christiaan Meihsl · July 2018

https://community.developers.refinitiv.com/discussion/comment/29308#Comment_29308

@pj4, your code is still running on my PC, I cannot change anything now, and considering how long it takes to run, will not be able to change anything more today.

I suggest you edit your code to save each r2 request body, and post the request body that generated the HTTP 400 error, so we can analyze it.
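A minimal sketch of saving each request body, assuming Python 3 and that the body is built as a dictionary before being sent. The helper name save_request_body, the directory, and the label scheme are all hypothetical.

```python
import json
import os

def save_request_body(body, directory, label):
    """Write a request body (a dict) to <directory>/<label>.json and return the path."""
    os.makedirs(directory, exist_ok=True)
    path = os.path.join(directory, label + ".json")
    with open(path, "w", encoding="utf-8") as fh:
        # Indented, key-sorted output makes two bodies easy to compare.
        json.dump(body, fh, indent=2, sort_keys=True)
    return path
```

Using a label that includes the instrument and FID would make it easy to find the body that produced the HTTP 400.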

Christiaan Meihsl · July 2018

https://community.developers.refinitiv.com/discussion/comment/29321#Comment_29321

@pj4, as an additional suggestion, considering the time it typically takes to retrieve the type of data you are requesting, you could reduce the polling time from 900 (15 minutes) to 300 (5 minutes). The entire run time of the whole program will benefit from that, without unduly loading the servers.

pj4 · July 2018

Sure Christiaan, I will change the wait time from 900 to 300 seconds. I was under the impression that polling to check the request status also loads the servers, which is why I changed it to 900 earlier.

Also, by checking the request body of r2, I found the issue with the LC, LH and FC RICs, so I am now able to download the data for them as well. Thank you very much for all your help.