TRTH slow download speed

Hi,
When I download TRTH files, the download speed is very slow, between 100 and 300 kB/s. This is particularly annoying with large files, which can take more than one hour to download.
I'm having this problem with your website https://hosted.datascope.reuters.com/ and also with the TRTH API. I also had this problem with TRTH v1, and TRTH v2 did not solve it.
I tried to download TRTH files over 3 different connections, including my personal home connection, which is not behind a proxy and is not restricted (optical fiber, 1 Gbit/s), and I always get a slow download speed. So the problem comes from your website and not from my connection.
I asked Reuters customer support by email, but they were not able to answer me.
Please help me with this issue.
Regards,
Answers
-
Are you trying to download VBDs or Custom files? Are you downloading through the GUI or through the REST API? The answers differ depending on the circumstances. If you are downloading VBD files through the REST API, you can use the "X-Direct-Download" header. If it is Custom through the REST API, then after August 13 you can use this feature. This will provide you with a direct connection to an AWS repository containing the data and the download speed will be as fast as AWS can provide it.
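For the REST API case, sending the header is just a matter of adding it to the request that fetches the extraction result. Here is a minimal Python sketch; the job id and token are placeholders, and the helper name is my own, not part of any Refinitiv library:

```python
# Build the URL and headers for fetching raw extraction results from DSS.
# With "X-Direct-Download: true" the server redirects the client to the
# AWS repository holding the data, bypassing the DSS front end.
DSS_BASE = "https://hosted.datascopeapi.reuters.com/RestApi/v1"

def build_download_request(job_id, session_token, direct_download=True):
    url = f"{DSS_BASE}/Extractions/RawExtractionResults('{job_id}')/$value"
    headers = {
        "Prefer": "respond-async",
        "Authorization": "Token " + session_token,
    }
    if direct_download:
        headers["X-Direct-Download"] = "true"
    return url, headers

# Placeholder values: pass the result to any HTTP client (requests, urllib, ...).
url, headers = build_download_request("0x0123456789abcdef", "<your-token>")
```

Whichever HTTP client you use, make sure it follows the redirect and streams the body to disk rather than buffering the whole file in memory.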
0 -
@RTPKRSQT48, this month I did some download performance tests (in Python), without using the X-Direct-Download: true header. On average, downloading a 323 MB file took approximately 5 minutes, i.e. I'm observing ~1 MB/s, which is faster than what you observed. I am behind a proxy, but have an excellent Internet connection and bandwidth. From your query I gather the large files you are downloading are between 100 and 300 MB (please correct me if I'm wrong). Could you share your code, and tell us what type of data you are downloading (VBD or Custom files)?
That might help us find a solution.
0 -
Hi Christiaan,
I'm downloading historical "Time and Sales" data. I'm having this issue with both the GUI and the API.
When you say you cannot replicate this problem, are you trying with an internal Reuters connection, or with an independent connection, like a connection from another company or a home connection? Maybe this will change things.
With the webpage GUI there is no code: I'm just selecting instruments and fields for a report and clicking on "immediate schedule". Then when I try to download, I get a 300-500 kB/s download speed. With the API, I'm using your code: the method "RequestRawTransactionsDataAndSaveToGzipFile()" from the C# TRTH example on your website. I did not modify your code, I just added ExtractionsContext.Options.AutomaticDecompression = false. Without the header context.DefaultRequestHeaders.Add("x-direct-download", "true") I get a very slow download rate (like with the GUI, 300-500 kB/s). WITH the header, the speed dramatically increases: up to 30 MB/s, so 60 times faster. But there is more: without the header, other problems appear. I receive uncompressed CSV files even if "AutomaticDecompression" is set to false, and in addition files are truncated: 80% of the data is missing for large files.
To conclude, TRTH is completely unusable without the "x-direct-download" flag. This is not really a problem for the API, as this flag solves all the problems. But in the GUI I did not find where I can activate this flag. Is it possible to activate this flag for the GUI as well?
Regards,
Ugo
0 -
@RTPKRSQT48, my Python tests were also with T&S data, over the Internet (I work at Thomson Reuters but don't have direct access to TRTH servers) using my office connection. I did not test GUI performance.
OK, you are using code from the C# example app. Let me test to see what performance I get from here in C#. I'll use the .Net C# Tutorial 5 code (available under the downloads tab), which saves the data in gzip format.
When you say 80% is missing from large files: what is the size of those files (not truncated)? On this topic, did you see this advisory?
GUI: I don't believe you can activate X-Direct-Download.
0 -
The size of the file I tested was 80 MB when gzipped.
It was not divided into small chunks as suggested in the link in your advisory. And the advisory mentions errors when unzipping files, but the problem I have is that I get an already unzipped file with missing data, although I requested a gzipped file (with the parameter "AutomaticDecompression = false"). Just activating X-Direct-Download solves this problem.
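The symptom (plain CSV arriving although a gzip was requested) matches what happens when an HTTP client transparently decompresses the response body before handing it over. A small stdlib-only Python sketch of the distinction; with the requests library, the equivalent of keeping the bytes compressed would be reading resp.raw with decode_content=False instead of resp.content:

```python
import gzip
import io

def save_stream_as_gzip(raw_stream, out_stream, chunk_size=64 * 1024):
    """Copy the response body byte-for-byte, without decompressing.

    This is the analogue of AutomaticDecompression = false: the bytes
    written out stay gzip-compressed, exactly as sent by the server.
    """
    while True:
        chunk = raw_stream.read(chunk_size)
        if not chunk:
            break
        out_stream.write(chunk)

# Simulate a gzip-encoded response body with an in-memory stream:
payload = b"RIC,Date,Price\nEUR=,2018-08-01,1.16\n"
compressed = gzip.compress(payload)

out = io.BytesIO()
save_stream_as_gzip(io.BytesIO(compressed), out)

assert out.getvalue() == compressed            # still gzip, byte-identical
assert gzip.decompress(out.getvalue()) == payload
```

If a proxy or client layer decompresses the stream anyway, the bytes written here would be plain CSV, which is consistent with what you are seeing without the x-direct-download header.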
Will X-Direct-Download be implemented for the GUI in the future? It would be very useful.
0 -
@RTPKRSQT48, I made a test using the .Net C# Tutorial 5 code, changing the request to retrieve 1 year of 1-minute bars for more than 8200 instruments (input file: TRTH_API_1942_input_file.csv), without X-Direct-Download.
The result was correctly saved as a 3.8 GB gzip file (containing an 88 GB CSV with more than 1200 million lines).
Download performance (from Thomson Reuters, via Internet through our proxy): ~640KB/sec
The essentials of the download code:
extractionsContext.Options.AutomaticDecompression = false;
RawExtractionResult extractionResult = extractionsContext.ExtractRaw(extractionRequest);
DssStreamResponse streamResponse = extractionsContext.GetReadStream(extractionResult);
using (var fileStream = File.Create(dataOutputFile))
    streamResponse.Stream.CopyTo(fileStream);
For the entire code, please refer to the original sample (available under the downloads tab).
0 -
Well, it's working on your computer but not on mine. It is also not working on my colleague's computer. It's not a big deal, as "x-direct-download" solves the problem for my colleague and myself.
0 -
@RTPKRSQT48, I'm happy that using AWS by adding "x-direct-download" solves the issue for you.
But I am still intrigued as to why, when not using AWS, I get better performance than you. If you have time and do not mind doing this, could you try running the latest .Net C# Tutorial 5 code (modifying it only to retrieve more data than it does by default), and compare its download performance with that of your code (based on the C# example app)? It might help us understand whether the issue is related to the way it is coded.
0 -
Hello, I am having the same issue.
I did a manual schedule on the DSS website and got a file of 1.5 GB.
When I tried to download it, it took two hours, even with an 800 MB/s line... Something weird and unexpected! My question is: where can I find documentation on downloading, with the API, a file generated with the DSS interface?
Thanks for your help
0 -
@Jerome Guiot-Dorel
From the ScheduleName, you need to use the following endpoint to get the ScheduleId.
GET https://hosted.datascopeapi.reuters.com/RestApi/v1/Extractions/ScheduleGetByName(ScheduleName='example-eod')
HTTP/1.1 200 OK
{
"@odata.context": "https://hosted.datascopeapi.reuters.com/RestApi/v1/$metadata#Schedules/$entity",
"ScheduleId": "0x0580701ad3ab1f86",
"Name": "example-eod",
"OutputFileName": "",
…
}
Then, after getting the ScheduleId, you can use the following endpoints to get the extractions:
- Extractions/Schedules({id})/CompletedExtractions returns the list of completed extractions for this schedule
- Extractions/Schedules({id})/LastExtraction returns information about the last completed extraction for this schedule
GET https://hosted.datascopeapi.reuters.com/RestApi/v1/Extractions/Schedules('0x0580702a9d0b1f86')/LastExtraction
HTTP/1.1 200 OK
{
"@odata.context": "https://hosted.datascopeapi.reuters.com/RestApi/v1/$metadata#ReportExtractions/$entity",
"ReportExtractionId": "6651428",
"ScheduleId": "0x0580702a9d0b1f86",
"Status": "Completed",
"DetailedStatus": "Done",
"ExtractionDateUtc": "2016-11-23T20:45:37.476Z",
"ScheduleName": "example-eod",
"IsTriggered": false,
"ExtractionStartUtc": "2016-11-23T20:45:52.000Z",
"ExtractionEndUtc": "2016-11-23T20:45:56.000Z"
}
After that, from the ReportExtractionId, you can get the list of files (data file and note file) belonging to this extraction report.
GET https://hosted.datascopeapi.reuters.com/RestApi/v1/Extractions/ReportExtractions('6651453')/Files
HTTP/1.1 200 OK
{
"@odata.context": "https://hosted.datascopeapi.reuters.com/RestApi/v1/$metadata#ExtractedFiles",
"value": [
{
"ExtractedFileId": "VjF8fDMzNjIzNTE",
"ReportExtractionId": "6651453",
"ScheduleId": "0x05807049631b1f86",
"FileType": "Full",
"ExtractedFileName": "9001552.ImmediateNonCustomized.20161123.144755.6651453.x04n05.csv",
"LastWriteTimeUtc": "2016-11-23T20:47:57.567Z",
"ContentsExists": true,
"Size": 122,
"ReceivedDateUtc": "2016-11-23T20:47:57.567Z"
},
{
"ExtractedFileId": "VjF8fDMzNjIzNTA",
"ReportExtractionId": "6651453",
"ScheduleId": "0x05807049631b1f86",
"FileType": "Note",
"ExtractedFileName": "9001552.ImmediateNonCustomized.20161123.144755.6651453.x04n05.csv.notes.txt",
"LastWriteTimeUtc": "2016-11-23T20:47:57.573Z",
"ContentsExists": true,
"Size": 1777,
"ReceivedDateUtc": "2016-11-23T20:47:57.573Z"
}
]
}
From the ExtractedFileId, you can download a file with this endpoint: Extractions/ExtractedFiles({id})/$value.
GET https://hosted.datascopeapi.reuters.com/RestApi/v1/Extractions/ExtractedFiles('VjF8fDMyMDk1NDc2OQ')/$value
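The chain of calls above can be sketched as a few URL builders, using only the endpoints and example ids shown in this answer (each GET also needs the usual Authorization: Token <token> header):

```python
# URL builders for the schedule -> extraction -> file download chain.
DSS_BASE = "https://hosted.datascopeapi.reuters.com/RestApi/v1"

def schedule_by_name_url(schedule_name):
    # Step 1: resolve a ScheduleName to its ScheduleId
    return f"{DSS_BASE}/Extractions/ScheduleGetByName(ScheduleName='{schedule_name}')"

def last_extraction_url(schedule_id):
    # Step 2: get the last completed extraction for that schedule
    return f"{DSS_BASE}/Extractions/Schedules('{schedule_id}')/LastExtraction"

def extraction_files_url(report_extraction_id):
    # Step 3: list the data file and note file of that extraction
    return f"{DSS_BASE}/Extractions/ReportExtractions('{report_extraction_id}')/Files"

def file_content_url(extracted_file_id):
    # Step 4: download the file content itself
    return f"{DSS_BASE}/Extractions/ExtractedFiles('{extracted_file_id}')/$value"

# With the example values from this answer:
print(schedule_by_name_url("example-eod"))
print(last_extraction_url("0x0580702a9d0b1f86"))
print(extraction_files_url("6651453"))
print(file_content_url("VjF8fDMzNjIzNTE"))
```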
There are other endpoints that can be useful, such as ReportExtractionGetCompletedByDateRangeByScheduleId or ReportExtractionGetCompletedByScheduleId. For more information, please refer to the API Reference Tree.
Please focus on the Extractions => Schedule, Extractions => Extraction => ReportExtraction, and Extractions => Extraction => ExtractedFile endpoints.
0 -