How can you get a list of available scheduled extracted files and download them via API by file name?
I've got a few requests scheduled to run daily that generate files I want to download. I created the requests through the GUI.
Ideally I'd like to download the extracted files by file name, because the name remains constant. Otherwise I'd need to obtain the list of all available extracted files with their job IDs and download each file by its job ID (I prefer names, as they stay the same).
I'm using Python with the requests library.
NOTE: I'm able to obtain some info about my extracted files from "https://hosted.datascopeapi.reuters.com/RestApi/v1/Extractions/ExtractedFiles", but I'm not sure it's the best approach, and I can't download by file name.
Best Answer
@alvaro.canencia, let me attempt to answer your queries:
Q1. How can I get those 'schedule ids' for my scheduled reports?
A1. There is a call to list all schedules (it returns them all, not only the active ones):
GET https://hosted.datascopeapi.reuters.com/RestApi/v1/Extractions/Schedules
The response includes (among other fields) the ScheduleID. You can run this example in the C# example app (its installation and usage are described in the Quick Start); it is the first one in the category Schedule Examples. As far as I know this should only return "Stored and Scheduled", not "On demand".
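Since the asker is working in Python with requests, the schedule-listing call can be sketched as below. This is a minimal sketch, not the official SDK: the `token` argument stands in for a DSS session token obtained beforehand via the Authentication endpoint, and error handling is kept to the bare minimum.

```python
# Minimal sketch: list all schedules and their ScheduleIDs via the DSS REST API.
# "token" is assumed to be a valid DSS session token obtained separately.
import requests

BASE = "https://hosted.datascopeapi.reuters.com/RestApi/v1"


def schedules_url():
    """URL of the schedule-listing endpoint quoted above."""
    return f"{BASE}/Extractions/Schedules"


def list_schedules(token):
    """Return all schedule entries; each entry includes a ScheduleId field."""
    resp = requests.get(
        schedules_url(),
        headers={"Authorization": f"Token {token}"},
    )
    resp.raise_for_status()
    return resp.json()["value"]


# Usage (requires a real session token):
#   for sched in list_schedules(token):
#       print(sched["ScheduleId"], sched.get("Name"))
```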
Q2. With the schedule id I got from the GUI I've tried to use the urls you provide, substituting the '123' with the 'schedule id' from the GUI. None of the urls work. I get the message: "Resource not found for the segment 'Schedule'"
A2. Using Schedule IDs, the workflow is:
a) Check each scheduled extraction status using its ScheduleID (retrieved as explained under A1 above):
GET https://hosted.datascopeapi.reuters.com/RestApi/v1/Extractions/Schedules('0x05a2b98d233b3036')/LastExtraction
Repeat until the returned Status is "Completed", then save the returned ReportExtractionId.
As your schedules are daily, you do this once a day after your schedules should have triggered, at a point in time when you expect them to have completed.
b) Retrieve the corresponding extraction report, using the ReportExtractionId:
GET https://hosted.datascopeapi.reuters.com/RestApi/v1/Extractions/ReportExtractions('2000000000197524')/Files
Save the returned ExtractedFileId of each file you want. I fully agree with Troy: you should download not only the data file but also the Note file, because it contains useful information on the extraction, any errors or warnings, etc.
c) Retrieve the files, using their ExtractedFileId:
GET https://hosted.datascopeapi.reuters.com/RestApi/v1/Extractions/ExtractedFiles('VjF8MHgwNWEyYjk5ODRiNmIyZjk2fA')/$value
These steps are all described in detail in REST API Tutorial 12. As you created your instrument lists, templates and schedules manually, you can skip the first steps of that tutorial, and start directly at step Check the extraction status, which corresponds to step a) in this answer.
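Steps a) to c) above can be sketched in Python with requests (the asker's setup). The endpoint paths and field names (Status, ReportExtractionId, ExtractedFileId) are taken from the calls quoted above; the polling interval, retry count, and token handling are assumptions for illustration.

```python
# Sketch of steps a)-c): poll a schedule's last extraction until it completes,
# list the extraction's files, and stream one file to disk.
import time
import requests

BASE = "https://hosted.datascopeapi.reuters.com/RestApi/v1"


def last_extraction_url(schedule_id):
    return f"{BASE}/Extractions/Schedules('{schedule_id}')/LastExtraction"


def files_url(report_extraction_id):
    return f"{BASE}/Extractions/ReportExtractions('{report_extraction_id}')/Files"


def file_content_url(extracted_file_id):
    return f"{BASE}/Extractions/ExtractedFiles('{extracted_file_id}')/$value"


def wait_for_completion(token, schedule_id, interval=60, attempts=30):
    """Step a): repeat the LastExtraction call until Status is 'Completed'."""
    headers = {"Authorization": f"Token {token}"}
    for _ in range(attempts):
        resp = requests.get(last_extraction_url(schedule_id), headers=headers)
        resp.raise_for_status()
        body = resp.json()
        if body.get("Status") == "Completed":
            return body["ReportExtractionId"]
        time.sleep(interval)
    raise TimeoutError(f"extraction for schedule {schedule_id} did not complete")


def list_files(token, report_extraction_id):
    """Step b): list the extraction's files (data file plus Note file)."""
    resp = requests.get(files_url(report_extraction_id),
                        headers={"Authorization": f"Token {token}"})
    resp.raise_for_status()
    return resp.json()["value"]


def download_file(token, extracted_file_id, dest_path):
    """Step c): stream the file content to dest_path."""
    with requests.get(file_content_url(extracted_file_id),
                      headers={"Authorization": f"Token {token}"},
                      stream=True) as resp:
        resp.raise_for_status()
        with open(dest_path, "wb") as out:
            for chunk in resp.iter_content(chunk_size=1 << 20):
                out.write(chunk)
```

Streaming the download (`stream=True` with `iter_content`) keeps memory use flat for large extraction files.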
Alternative using file names
You could choose to ignore the schedules completely, and just look at the extracted files.
You can simply get the list of extracted files for all schedules. The call delivers the file ID, report extraction ID, schedule ID, file type, file name, timestamp, size, etc.
GET https://hosted.datascopeapi.reuters.com/RestApi/v1/Extractions/ExtractedFiles
After that you can get the changes to that list, using the delta token returned by the previous call:
GET https://hosted.datascopeapi.reuters.com/RestApi/v1/Extractions/ExtractedFiles?$deltatoken='MjAxNy0wOS0wMVQxNTo0MjoxNi4yNzAwMDAwfDB4MDAwMDAwMDAwMDAwMDAwMCwzMTUyMjUyNjg'
Finally, you can proceed to get the files, using the returned file ID(s):
GET https://hosted.datascopeapi.reuters.com/RestApi/v1/Extractions/ExtractedFiles('VjF8fDMxNTIyNzYzOQ')/$value
Note you can run the calls I list here under Scheduled Extractions in the C# example app (described in the Quick Start).
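The file-name based alternative can be sketched in Python with requests as follows: list the extracted files once, keep the delta token for incremental follow-up calls, and pick out files by their ExtractedFileName. The `ExtractedFileName` field and the `@odata.deltaLink` location of the next delta token are assumptions based on the calls above and standard OData conventions; verify them against an actual response.

```python
# Sketch: find extracted files by file name via the ExtractedFiles endpoint,
# then fetch only changes on later calls using the delta token.
import requests

BASE = "https://hosted.datascopeapi.reuters.com/RestApi/v1"


def extracted_files_url(delta_token=None):
    """Full listing by default; incremental listing when a delta token is given."""
    url = f"{BASE}/Extractions/ExtractedFiles"
    if delta_token is not None:
        url += f"?$deltatoken='{delta_token}'"
    return url


def find_files_by_name(token, wanted_name, delta_token=None):
    """Return (matching file entries, delta link for the next incremental call)."""
    resp = requests.get(extracted_files_url(delta_token),
                        headers={"Authorization": f"Token {token}"})
    resp.raise_for_status()
    body = resp.json()
    matches = [f for f in body["value"]
               if f.get("ExtractedFileName") == wanted_name]
    # Assumption: the next delta token is carried in the OData delta link.
    return matches, body.get("@odata.deltaLink")
```

Each matching entry carries an ExtractedFileId, which you can feed into the `/$value` download call shown above.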
Conclusion
There is no "best way"; it really depends on your workflow and your own preferences.
Hope this helps.
Answers
Yes, you can use the "ExtractedFiles" function, but it returns all extracted files in the history of the user account, and the list may grow very large.
You can also use the "Jobs" functions, such as:
https://hosted.datascopeapi.reuters.com/RestApi/v1/Jobs/Jobs
https://hosted.datascopeapi.reuters.com/RestApi/v1/Jobs/JobGetCompleted
https://hosted.datascopeapi.reuters.com/RestApi/v1/Jobs/JobGetActive
The TRTH REST API Reference Guide has detailed explanations:
https://hosted.datascopeapi.reuters.com/RestApi.Help/Context/Entity?ctx=Jobs&ent=Job
Thanks @steven.peng
The problem with your "Jobs" requests is that they don't retrieve any of the scheduled reports (only on demand jobs, which I don't need).
However, with "ExtractedFiles" I get both scheduled and on demand, but as you said the list can grow too large. Any alternative?
Additionally, is there a way to use the output file name defined in the GUI (which would remain constant)? Or is the only way to request the file with the job ID/extraction ID/schedule ID?
Also, when creating and using the URL for downloading with my ID from "Extracted" (https://hosted.datascopeapi.reuters.com/RestApi/v1/Extractions/ExtractRawResult(ExtractionId='2000000001888079')), after 5 minutes of waiting I get the message:
"Job of id '2000000001888079' not found". So it needs 5 minutes to reply that something doesn't exist? And the response is incorrect after 5 minutes?
You can download the last extraction's file given a schedule ID using the following URL:
https://hosted.datascopeapi.reuters.com/RestApi/v1/Extractions/Schedule('123')/LastExtraction/$value
- The Notes file must be downloaded separately, and we do recommend you download it for support purposes.
- If you happen to miss a day, you will need to follow a different process to catch up.
- You need to be able to handle duplicates in the event that we do not create a file (unlikely event of a DSS error).
- You should allow adequate time to pass before checking for the file (so as to avoid checking too early).
Hi Troy, thanks for your answer; I'm afraid it doesn't work.
1. How can I get those 'schedule ids' for my scheduled reports? I could get one of those IDs from the GUI (in Extracted Files => Notes). However, for hundreds of IDs, how can I get them via the API? And I always mean "Stored and Scheduled" reports, not "On demand".
2. With the schedule ID I got from the GUI I've tried to use the URLs you provide, substituting the '123' with the 'schedule id' from the GUI. None of the URLs work. I get the message: "Resource not found for the segment 'Schedule'"
NOTE: even with the 'schedule ids' I get from https://hosted.datascopeapi.reuters.com/RestApi/v1/Extractions/ExtractedFiles, pasting them into your URLs gives me the same "Resource not found for the segment 'Schedule'" error message.
Thanks Christiaan, this is what I needed. It works except for getting the status with the ScheduleID (A2a); I always get the following error:
{"error":{"code":"b4e06a6b-ed37-4d84-9082-f61b1d6742c6","message":"Resource not found for the segment 'Schedule'."}}
However, maybe I can work with your last suggestion, using file names and studying REST API Tutorial 12.
@alvaro.canencia, glad this helped.
On getting the status of a schedule (A2a) I just re-tested it. I saw I made a typo in my response above, which I have now corrected. It was:
GET https://hosted.datascopeapi.reuters.com/RestApi/v1/Extractions/Schedule('0x05a2b98d233b3036')/LastExtraction
But the endpoint should be Schedules with an s at the end:
GET https://hosted.datascopeapi.reuters.com/RestApi/v1/Extractions/Schedules('0x05a2b98d233b3036')/LastExtraction
Apologies for that. Hopefully it now works for you as well.