
TRTH slow download speed

Hi,

When I download TRTH files, download speed is very slow, between 100 and 300 kB/s. This is particularly annoying with large files which can take more than one hour to download.

I'm having this problem with your website https://hosted.datascope.reuters.com/ and also with the TRTH API. I already had this problem with TRTH v1, and TRTH v2 did not solve it.

I tried to download TRTH files over 3 different connections, including my personal home connection (optical fiber, 1 Gbit/s), which is not behind a proxy and is not restricted, and I always get a slow download speed. So the problem comes from your website, not from my connection.

I asked Reuters customer support by email but they were not able to help me.

Please help me with this issue.

Regards,

tick-history-rest-api


Are you trying to download VBDs or Custom files? Are you downloading through the GUI or through the REST API? The answers differ depending on the circumstances. If you are downloading VBD files through the REST API, you can use the "X-Direct-Download" header. If it is Custom through the REST API, then after August 13 you can use this feature. This will provide you with a direct connection to an AWS repository containing the data and the download speed will be as fast as AWS can provide it.
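In practice, setting that header on the raw-result download request is all that is needed; the server then redirects the client to AWS. Below is a minimal, hypothetical Python sketch of that request using only the standard library. The token value, job ID, and helper names are placeholders for illustration, not part of an official SDK.

```python
import shutil
import urllib.request

DSS_BASE = "https://hosted.datascopeapi.reuters.com/RestApi/v1"

def build_download_headers(token, direct_download=True):
    """Build the headers for a raw extraction result download request."""
    headers = {
        "Authorization": "Token " + token,
        "Prefer": "respond-async",
    }
    if direct_download:
        # Asks the server to redirect the download straight to the AWS
        # repository holding the file, bypassing the DSS front end.
        headers["X-Direct-Download"] = "true"
    return headers

def download_raw_result(token, job_id, out_path):
    """Stream a RawExtractionResults payload to a local file."""
    url = "%s/Extractions/RawExtractionResults('%s')/$value" % (DSS_BASE, job_id)
    req = urllib.request.Request(url, headers=build_download_headers(token))
    # urlopen follows the HTTP redirect to the AWS repository automatically,
    # and copyfileobj copies in chunks without loading the file into memory.
    with urllib.request.urlopen(req) as resp, open(out_path, "wb") as f:
        shutil.copyfileobj(resp, f)
```

The same header can be added in any HTTP client; the key point is that it changes where the bytes are served from, not what is served.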



@RTPKRSQT48, this month I did some download performance tests (in Python), without using the X-Direct-Download: true header. On average, downloading a 323MB file took approximately 5 minutes, i.e. I'm observing ~1MB/s, which is faster than what you observed. I am behind a proxy, but have an excellent Internet connection and bandwidth. From your query I gather the large files you are downloading are between 100 and 300 MB (please correct me if I'm wrong).

Could you share your code, and tell us what type of data you are downloading (VBD or Custom files)?

That might help us to find a solution.



Hi Christiaan,

I'm downloading historical "Time and sales" data.

I'm having this issue with both GUI and API.

When you say you cannot replicate this problem, are you testing with an internal Reuters connection, or with an independent connection, like one from another company or a home connection? Maybe that would change things.

With the web GUI there is no code: I'm just selecting instruments and fields for a report and clicking "immediate schedule". When I then try to download, I get a 300-500 kB/s download speed.

With the API, I'm using your code: the method "RequestRawTransactionsDataAndSaveToGzipFile()" from the C# TRTH example on your website. I did not modify your code; I only added ExtractionsContext.Options.AutomaticDecompression = false. Without the header context.DefaultRequestHeaders.Add("x-direct-download", "true") I get a very slow download rate (like with the GUI, 300-500 kB/s). WITH the header the speed increases dramatically, up to 30 MB/s, so 60 times faster. But there is more: without the header, other problems appear. I receive uncompressed CSV files even though AutomaticDecompression is set to false, and in addition the files are truncated: 80% of the data is missing for large files.

To conclude, TRTH is completely unusable without the "x-direct-download" flag. This is not really a problem for the API, as the flag solves everything, but I did not find where to activate it in the GUI. Is it possible to activate this flag for the GUI as well?

Regards,

Ugo





@RTPKRSQT48, my Python tests were also with T&S data, over the Internet (I work at Thomson Reuters but don't have direct access to TRTH servers) using my office connection. I did not test GUI performance.

OK, you are using code from the C# example app. Let me test what performance I get here with C#. I'll use the .Net C# Tutorial 5 code (available under the downloads tab); it saves in gzip.

When you say 80% is missing from large files: what is the (non-truncated) size of those files? On this topic, did you see this advisory?

GUI: I don't believe you can activate X-direct-download.


The size of the file I tested was 80 MB when gzipped.

It was not divided into small chunks as suggested in your advisory. And the advisory mentions errors when unzipping files, but my problem is that I get an already-unzipped file with missing data, even though I requested a gzipped file (with AutomaticDecompression = false). Just activating X-direct-download solves this.
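The symptom described (receiving a plain CSV when a gzip file was requested) can be detected cheaply before parsing: every gzip file starts with the two-byte magic number 0x1f 0x8b. Below is a small, illustrative Python check; the helper name is mine, not part of any TRTH tooling.

```python
GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream (RFC 1952)

def looks_like_gzip(path_or_bytes):
    """Return True if the data starts with the gzip magic number."""
    if isinstance(path_or_bytes, bytes):
        head = path_or_bytes[:2]
    else:
        with open(path_or_bytes, "rb") as f:
            head = f.read(2)
    return head == GZIP_MAGIC
```

Running this on a freshly downloaded file distinguishes "the server sent me decompressed CSV" from "the gzip file is corrupt": in the first case the magic bytes are absent entirely.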

Will X-direct-download be implemented for the GUI in the future? It would be very useful.



@RTPKRSQT48, I made a test using the .Net C# Tutorial 5 code, changing the request to retrieve 1 year of 1 minute bars for more than 8200 instruments (input file: TRTH_API_1942_input_file.csv), without X-Direct-download.

Result was correctly saved as a gzip file of 3.8GB (containing an 88GB CSV with more than 1200 million lines).

Download performance (from Thomson Reuters, via Internet through our proxy): ~640KB/sec

The essentials of the download code:

// Keep the payload gzipped: do not decompress on the fly
extractionsContext.Options.AutomaticDecompression = false;
// Run the extraction and get a handle on the raw result
RawExtractionResult extractionResult = extractionsContext.ExtractRaw(extractionRequest);
// Open a read stream on the result and copy it to a local file
DssStreamResponse streamResponse = extractionsContext.GetReadStream(extractionResult);
using (var fileStream = File.Create(dataOutputFile))
    streamResponse.Stream.CopyTo(fileStream);

For the entire code, please refer to the original sample (available under the downloads tab).
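For readers not on .NET, the stream-to-file copy above can be sketched in Python. The helper below is generic (it works on any pair of binary file-like objects) and also returns the elapsed time, which is handy for comparing runs with and without X-Direct-Download. The function name and signature are illustrative, not from any Reuters sample.

```python
import time

def copy_stream(src, dst, chunk_size=1024 * 1024):
    """Copy src to dst in fixed-size chunks.

    Returns (bytes_copied, seconds_elapsed) so the caller can compute
    an average transfer rate without holding the file in memory.
    """
    total = 0
    start = time.monotonic()
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        dst.write(chunk)
        total += len(chunk)
    return total, time.monotonic() - start
```

Dividing bytes_copied by seconds_elapsed gives the average rate, e.g. to reproduce the ~640 KB/s and ~30 MB/s figures discussed in this thread.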



Well, it's working on your computer but not on mine, and it is also not working on my colleague's computer. It's not a big deal, as "x-direct-download" solves the problem for my colleague and myself.


@RTPKRSQT48, I'm happy that using AWS by adding "x-direct-download" solves the issue for you.

But I am still intrigued as to why, when not using AWS, I get better performance than you. If you have time and do not mind doing this, could you try running the latest .Net C# Tutorial 5 code (modifying it only to retrieve more data than it does by default), and compare its download performance with that of your code (based on the C# example app)? It might help us understand if the issue is related to the way it is coded.


Hello, I am having the same issue.

I ran a manual schedule on the DSS website and got a 1.5 GB file. Downloading it took two hours, even over an 800 MB/s line. Something weird and unexpected!

My question is: where can I find documentation on downloading, via the API, a file that was generated with the DSS interface?

Thanks for your help


@Jerome Guiot-Dorel


From the ScheduleName, use the following endpoint to get the ScheduleId.

GET https://hosted.datascopeapi.reuters.com/RestApi/v1/Extractions/ScheduleGetByName(ScheduleName='example-eod')
 
HTTP/1.1 200 OK
{
    "@odata.context": "https://hosted.datascopeapi.reuters.com/RestApi/v1/$metadata#Schedules/$entity",
    "ScheduleId": "0x0580701ad3ab1f86",
    "Name": "example-eod",
    "OutputFileName": "",
    …
}

Then, after getting the ScheduleId, you can use the following endpoints to get the extractions.

  • Extractions/Schedules({id})/CompletedExtractions returns the list of completed extractions for this schedule
  • Extractions/Schedules({id})/LastExtraction returns information about the last completed extraction for this schedule
GET https://hosted.datascopeapi.reuters.com/RestApi/v1/Extractions/Schedules('0x0580702a9d0b1f86')/LastExtraction
 
HTTP/1.1 200 OK
{
    "@odata.context": "https://hosted.datascopeapi.reuters.com/RestApi/v1/$metadata#ReportExtractions/$entity",
    "ReportExtractionId": "6651428",
    "ScheduleId": "0x0580702a9d0b1f86",
    "Status": "Completed",
    "DetailedStatus": "Done",
    "ExtractionDateUtc": "2016-11-23T20:45:37.476Z",
    "ScheduleName": "example-eod",
    "IsTriggered": false,
    "ExtractionStartUtc": "2016-11-23T20:45:52.000Z",
    "ExtractionEndUtc": "2016-11-23T20:45:56.000Z"
}

After that, from the ReportExtractionId, you can get the list of files (the data file and the note file) belonging to this extraction report.

GET https://hosted.datascopeapi.reuters.com/RestApi/v1/Extractions/ReportExtractions('6651453')/Files
 
HTTP/1.1 200 OK
{
    "@odata.context": "https://hosted.datascopeapi.reuters.com/RestApi/v1/$metadata#ExtractedFiles",
    "value": [
        {
            "ExtractedFileId": "VjF8fDMzNjIzNTE",
            "ReportExtractionId": "6651453",
            "ScheduleId": "0x05807049631b1f86",
            "FileType": "Full",
            "ExtractedFileName": "9001552.ImmediateNonCustomized.20161123.144755.6651453.x04n05.csv",
            "LastWriteTimeUtc": "2016-11-23T20:47:57.567Z",
            "ContentsExists": true,
            "Size": 122,
            "ReceivedDateUtc": "2016-11-23T20:47:57.567Z"
        },
        {
            "ExtractedFileId": "VjF8fDMzNjIzNTA",
            "ReportExtractionId": "6651453",
            "ScheduleId": "0x05807049631b1f86",
            "FileType": "Note",
            "ExtractedFileName": "9001552.ImmediateNonCustomized.20161123.144755.6651453.x04n05.csv.notes.txt",
            "LastWriteTimeUtc": "2016-11-23T20:47:57.573Z",
            "ContentsExists": true,
            "Size": 1777,
            "ReceivedDateUtc": "2016-11-23T20:47:57.573Z"
        }
    ]
} 

From ExtractedFileId, you can download a file with this endpoint: Extractions/ExtractedFiles({id})/$value.

GET https://hosted.datascopeapi.reuters.com/RestApi/v1/Extractions/ExtractedFiles('VjF8fDMyMDk1NDc2OQ')/$value

There are other endpoints that can be useful, such as ReportExtractionGetCompletedByDateRangeByScheduleId or ReportExtractionGetCompletedByScheduleId. For more information, please refer to the API Reference Tree. Please focus on the Extractions => Schedule, Extractions => Extraction => ReportExtraction, and Extractions => Extraction => ExtractedFile endpoints.
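The chain of endpoints above (schedule name -> ScheduleId -> last extraction -> file list -> file content) can be summed up as a few URL builders. This Python sketch only constructs the URLs shown in the examples; the function names are my own, and the actual requests would still need the Authorization header and JSON handling.

```python
BASE = "https://hosted.datascopeapi.reuters.com/RestApi/v1"

def schedule_by_name_url(schedule_name):
    # Step 1: resolve a schedule name to its ScheduleId
    return "%s/Extractions/ScheduleGetByName(ScheduleName='%s')" % (BASE, schedule_name)

def last_extraction_url(schedule_id):
    # Step 2: get the last completed extraction for that schedule
    return "%s/Extractions/Schedules('%s')/LastExtraction" % (BASE, schedule_id)

def extraction_files_url(report_extraction_id):
    # Step 3: list the files (data + notes) of that extraction
    return "%s/Extractions/ReportExtractions('%s')/Files" % (BASE, report_extraction_id)

def file_download_url(extracted_file_id):
    # Step 4: download one file's content
    return "%s/Extractions/ExtractedFiles('%s')/$value" % (BASE, extracted_file_id)
```

Each step's response supplies the identifier needed by the next one, exactly as in the JSON examples above.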

