I've got a few requests scheduled to run daily and generate files which I want to download. I created the requests through the GUI.
Ideally I'd like to be able to download the extracted files using the file name (because it will remain constant). Otherwise I'd need to obtain the list of all available extracted files with their job IDs and download them by job ID (I prefer names, as they stay the same).
I'm using Python with the requests library.
NOTE: I'm able to obtain some info about my extracted files in "https://hosted.datascopeapi.reuters.com/RestApi/v1/Extractions/ExtractedFiles", but I'm not sure if it's the best way and I can't download with the file name
@alvaro.canencia, let me attempt to answer your queries:
Q1. How can I get those 'schedule ids' for my scheduled reports?
A1. There is a call to list all schedules (it returns them all, not only the active ones):
The response includes (among other fields) the ScheduleID. You can run this example in the C# example app (its installation and usage are described in the Quick Start); it is the first one in the Schedule Examples category. As far as I know, this should only return "Stored and Scheduled" schedules, not "On demand".
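Since you mention hundreds of schedules, here is a minimal Python sketch of turning the parsed schedule list into a name-to-ID lookup. It assumes the response body is OData-style JSON with a `value` array whose items carry `ScheduleId` and `Name` fields; those field names and the sample payload are assumptions for illustration, so check them against a real response.

```python
def schedule_ids_by_name(schedules_response):
    """Map schedule name -> schedule ID from a parsed 'list schedules' response.

    Assumption: the body looks like {"value": [{"ScheduleId": ..., "Name": ...}]}.
    """
    return {
        item["Name"]: item["ScheduleId"]
        for item in schedules_response.get("value", [])
    }


# Hypothetical sample payload, for illustration only (not real IDs).
sample_response = {
    "value": [
        {"ScheduleId": "0x0000000000000001", "Name": "DailyPrices"},
        {"ScheduleId": "0x0000000000000002", "Name": "DailyCorpActions"},
    ]
}

schedule_ids = schedule_ids_by_name(sample_response)
print(schedule_ids["DailyPrices"])
```

With the requests library, `schedules_response` would simply be `response.json()` from the list call.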
Q2. With the schedule id I got from the GUI I've tried to use the urls you provide, substituting the '123' with the 'schedule id' from the GUI. None of the urls work. I get the message: "Resource not found for the segment 'Schedule'"
A2. Using Schedule IDs, the workflow is:
a) Check each scheduled extraction status using its ScheduleID (retrieved as explained under A1 above):
Repeat until the returned status is "Completed", then save the returned ReportExtractionId.
As your schedules are daily, you do this daily after your schedules should have triggered, at a point in time when you expect them to have completed.
b) Retrieve the corresponding extraction report, using the ReportExtractionId:
Save the returned ExtractedFileId for each file you want. I fully agree with Troy: you should download not only the data file but also the Note file, because it contains useful information on the extraction, any errors or warnings, etc.
c) Retrieve the files, using their ExtractedFileId:
These steps are all described in detail in REST API Tutorial 12. As you created your instrument lists, templates and schedules manually, you can skip the first steps of that tutorial, and start directly at step Check the extraction status, which corresponds to step a) in this answer.
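Steps a) to c) could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the base URL is the one from this thread, but the three endpoint paths, the `value` wrapper and the exact JSON field names are assumptions that should be verified against the REST API Reference. The `session` argument is any object with a `requests.Session`-style `get` method that already sends your authentication header.

```python
import time

BASE = "https://hosted.datascopeapi.reuters.com/RestApi/v1"

# Assumed endpoint templates; verify them in the REST API Reference.
LAST_EXTRACTION = BASE + "/Extractions/Schedules('{schedule_id}')/LastExtraction"
REPORT_FILES = BASE + "/Extractions/ReportExtractions('{report_id}')/Files"
FILE_CONTENT = BASE + "/Extractions/ExtractedFiles('{file_id}')/$value"


def wait_for_report_extraction_id(session, schedule_id, poll_seconds=60, max_polls=30):
    """Step a): poll the schedule until its status is "Completed"."""
    url = LAST_EXTRACTION.format(schedule_id=schedule_id)
    for _ in range(max_polls):
        resp = session.get(url)
        resp.raise_for_status()
        body = resp.json()
        if body.get("Status") == "Completed":
            return body["ReportExtractionId"]
        time.sleep(poll_seconds)
    raise TimeoutError("Schedule %s did not complete in time" % schedule_id)


def list_extracted_files(session, report_id):
    """Step b): return the file entries of one extraction report."""
    resp = session.get(REPORT_FILES.format(report_id=report_id))
    resp.raise_for_status()
    return resp.json().get("value", [])


def download_file(session, file_id, target_path):
    """Step c): save one extracted file to disk, using its ExtractedFileId."""
    resp = session.get(FILE_CONTENT.format(file_id=file_id))
    resp.raise_for_status()
    with open(target_path, "wb") as fh:
        fh.write(resp.content)
```

In practice you would call the three functions in order for each ScheduleID, and download both the data file and the Note file from the list returned in step b).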
Alternative using file names
You could choose to ignore the schedules completely, and just look at the extracted files.
You can simply get the latest extracted files for all schedules. The call returns the file ID, report extraction ID, schedule ID, file type, file name, timestamp, size, etc.
After that you can get the changes to that list (using the delta token returned by the previous call).
Finally, you can proceed to get the files, using the returned file ID(s):
Note you can run the calls I list here under Scheduled Extractions in the C# example app (described in the Quick Start).
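As the file name stays constant while the file ID changes with every run, one way to use the name is to filter the returned list by name and keep the newest match. A minimal Python sketch, assuming each entry is a dict with `ExtractedFileName`, `ExtractedFileId` and `LastWriteTimeUtc` keys (assumed field names) and that all timestamps share one ISO 8601 format, so they can be compared as strings:

```python
def latest_file_id_by_name(extracted_files, file_name):
    """Return the ExtractedFileId of the newest entry named file_name, or None."""
    matches = [f for f in extracted_files if f.get("ExtractedFileName") == file_name]
    if not matches:
        return None
    # Identically formatted ISO 8601 UTC timestamps sort correctly as strings.
    newest = max(matches, key=lambda f: f["LastWriteTimeUtc"])
    return newest["ExtractedFileId"]


# Hypothetical entries, for illustration only.
sample_files = [
    {"ExtractedFileId": "A", "ExtractedFileName": "prices.csv",
     "LastWriteTimeUtc": "2017-05-01T06:00:00.000Z"},
    {"ExtractedFileId": "B", "ExtractedFileName": "prices.csv",
     "LastWriteTimeUtc": "2017-05-02T06:00:00.000Z"},
    {"ExtractedFileId": "C", "ExtractedFileName": "prices.csv.notes.txt",
     "LastWriteTimeUtc": "2017-05-02T06:00:01.000Z"},
]

file_id = latest_file_id_by_name(sample_files, "prices.csv")
print(file_id)
```

The resulting ID is then what you pass to the download call, so your own code only ever needs to know the constant file name.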
There is no "best way"; it really depends on your workflow and your own preferences.
Hope this helps.
Yes, you can use the "ExtractedFiles" function, but it will return all extracted files in the history of the user account, and the list may grow very large.
You can also use "Jobs" functions such as https://hosted.datascopeapi.reuters.com/RestApi/v1/Jobs/Jobs
TRTH REST API Reference Guide has detailed explanations:
The problem with your "Jobs" requests is that they don't retrieve any of the scheduled reports (only on demand jobs, which I don't need).
However, with "ExtractedFiles" I get both scheduled and on demand files, but as you said the list can grow too large. Any alternative?
Additionally, is there a way to use the output file name defined in the GUI (which would remain constant)? Or is the only way to request the file with the job id/extraction id/schedule id?
Also, when I build the download URL with my id from "Extracted" (https://hosted.datascopeapi.reuters.com/RestApi/v1/Extractions/ExtractRawResult(ExtractionId='2000000001888079')), after waiting 5 mins I get the message
"Job of id '2000000001888079' not found". So it needs 5 mins to reply that something doesn't exist? And the response is incorrect after 5 mins?
You can download the last extraction's file given a schedule id using the following URL:
Hi Troy, thanks for your answer. I'm afraid it doesn't work.
1. How can I get those 'schedule ids' for my scheduled reports? I could get one of those ids from the GUI (in Extracted Files => Notes). However for hundreds of ids how can I get them via the API? And I always mean "Stored and Scheduled" reports, not "On demand".
2. With the schedule id I got from the GUI I've tried to use the urls you provide, substituting the '123' with the 'schedule id' from the GUI. None of the urls work. I get the message: "Resource not found for the segment 'Schedule'"
NOTE: even with the 'schedule ids' I get from https://hosted.datascopeapi.reuters.com/RestApi/v1/Extractions/ExtractedFiles, pasting them into your urls gives the same "Resource not found for the segment 'Schedule'" error message.