TRTH slow download speed

Hi,
When I download TRTH files, the download speed is very slow, between 100 and 300 kB/s. This is particularly annoying with large files, which can take more than one hour to download.
I'm having this problem with your website https://hosted.datascope.reuters.com/ and also with the TRTH API. I also had this problem with TRTH v1, and TRTH v2 did not solve it.
I tried to download TRTH files over 3 different connections, including my personal home connection, which is not behind a proxy and is not restricted (optical fiber, 1 Gbit/s), and I always get a slow download speed. So the problem comes from your website and not from my connection.
I asked Reuters customer support by email, but they were not able to answer me.
Please help me with this issue.
Regards,
Answers
-
Are you trying to download VBDs or Custom files? Are you downloading through the GUI or through the REST API? The answers differ depending on the circumstances. If you are downloading VBD files through the REST API, you can use the "X-Direct-Download" header. If it is Custom through the REST API, then after August 13 you can use this feature. This will provide you with a direct connection to an AWS repository containing the data and the download speed will be as fast as AWS can provide it.
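For the REST API case, sending the header is just a matter of adding it to the request that fetches the extraction result. Here is a minimal Python sketch; the job id and token are placeholders, and the helper name is my own, not part of any Refinitiv library:

```python
# Build the URL and headers for fetching raw extraction results from DSS.
# With "X-Direct-Download: true" the server redirects the client to the
# AWS repository holding the data, bypassing the DSS front end.
DSS_BASE = "https://hosted.datascopeapi.reuters.com/RestApi/v1"

def build_download_request(job_id, session_token, direct_download=True):
    url = f"{DSS_BASE}/Extractions/RawExtractionResults('{job_id}')/$value"
    headers = {
        "Prefer": "respond-async",
        "Authorization": "Token " + session_token,
    }
    if direct_download:
        headers["X-Direct-Download"] = "true"
    return url, headers

# Placeholder values: pass the result to any HTTP client (requests, urllib, ...).
url, headers = build_download_request("0x0123456789abcdef", "<your-token>")
```

Whichever HTTP client you use, make sure it follows the redirect and streams the body to disk rather than buffering the whole file in memory.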
0 -
@RTPKRSQT48, this month I did some download performance tests (in Python), without using the X-Direct-Download: true header. On average, downloading a 323 MB file took approximately 5 minutes, i.e. I'm observing ~1 MB/s, which is faster than what you observed. I am behind a proxy, but have an excellent Internet connection and bandwidth. From your query I gather the large files you are downloading are between 100 and 300 MB (please correct me if I'm wrong). Could you share your code, and tell us what type of data you are downloading (VBD or Custom files)?
That might help us find a solution.
0 -
Hi Christiaan,
I'm downloading historical "Time and Sales" data. I'm having this issue with both the GUI and the API.
When you say you cannot replicate this problem, are you trying with an internal Reuters connection, or with an independent connection, like a connection from another company or a home connection? Maybe this will change things.
With the webpage GUI there is no code: I'm just selecting instruments and fields for a report and clicking on "immediate schedule". Then when I try to download, I get a 300-500 kB/s download speed. With the API, I'm using your code: the method "RequestRawTransactionsDataAndSaveToGzipFile()" from the C# TRTH example on your website. I did not modify your code, I just added ExtractionsContext.Options.AutomaticDecompression = false. Without the header context.DefaultRequestHeaders.Add("x-direct-download", "true") I get a very slow download rate (like with the GUI, 300-500 kB/s). WITH the header, the speed dramatically increases: up to 30 MB/s, so 60 times faster. But there is more: without the header, other problems appear. I receive uncompressed CSV files even if "AutomaticDecompression" is set to false, and in addition files are truncated: 80% of the data is missing for large files.
To conclude, TRTH is completely unusable without the "x-direct-download" flag. This is not really a problem for the API, as this flag solves all the problems. But in the GUI I did not find where I can activate this flag. Is it possible to activate this flag for the GUI as well?
Regards,
Ugo
0 -
@RTPKRSQT48, my Python tests were also with T&S data, over the Internet (I work at Thomson Reuters but don't have direct access to TRTH servers) using my office connection. I did not test GUI performance.
OK, you are using code from the C# example app. Let me test to see what performance I get from here in C#. I'll use the .Net C# Tutorial 5 code (available under the downloads tab), which saves the data in gzip format.
When you say 80% is missing from large files: what is the size of those files (not truncated)? On this topic, did you see this advisory?
GUI: I don't believe you can activate X-Direct-Download.
0 -
The size of the file I tested was 80 MB when gzipped.
It was not divided into small chunks as suggested in the link in your advisory. And the advisory mentions errors when unzipping files, but the problem I have is that I get an already unzipped file with missing data, although I requested a gzipped file (with the parameter "AutomaticDecompression = false"). Just activating X-Direct-Download solves this problem.
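The symptom (plain CSV arriving although a gzip was requested) matches what happens when an HTTP client transparently decompresses the response body before handing it over. A small stdlib-only Python sketch of the distinction; with the requests library, the equivalent of keeping the bytes compressed would be reading resp.raw with decode_content=False instead of resp.content:

```python
import gzip
import io

def save_stream_as_gzip(raw_stream, out_stream, chunk_size=64 * 1024):
    """Copy the response body byte-for-byte, without decompressing.

    This is the analogue of AutomaticDecompression = false: the bytes
    written out stay gzip-compressed, exactly as sent by the server.
    """
    while True:
        chunk = raw_stream.read(chunk_size)
        if not chunk:
            break
        out_stream.write(chunk)

# Simulate a gzip-encoded response body with an in-memory stream:
payload = b"RIC,Date,Price\nEUR=,2018-08-01,1.16\n"
compressed = gzip.compress(payload)

out = io.BytesIO()
save_stream_as_gzip(io.BytesIO(compressed), out)

assert out.getvalue() == compressed            # still gzip, byte-identical
assert gzip.decompress(out.getvalue()) == payload
```

If a proxy or client layer decompresses the stream anyway, the bytes written here would be plain CSV, which is consistent with what you are seeing without the x-direct-download header.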
Will X-Direct-Download be implemented for the GUI in the future? It would be very useful.
0 -
@RTPKRSQT48, I made a test using the .Net C# Tutorial 5 code, changing the request to retrieve 1 year of 1-minute bars for more than 8200 instruments (input file: TRTH_API_1942_input_file.csv), without X-Direct-Download.
The result was correctly saved as a 3.8 GB gzip file (containing an 88 GB CSV with more than 1200 million lines).
Download performance (from Thomson Reuters, via Internet through our proxy): ~640KB/sec
The essentials of the download code:
extractionsContext.Options.AutomaticDecompression = false;
RawExtractionResult extractionResult = extractionsContext.ExtractRaw(extractionRequest);
DssStreamResponse streamResponse = extractionsContext.GetReadStream(extractionResult);
using (var fileStream = File.Create(dataOutputFile))
    streamResponse.Stream.CopyTo(fileStream);
For the entire code, please refer to the original sample (available under the downloads tab).
0 -
Well, it's working on your computer but not on mine. It is also not working on my colleague's computer. It's not a big deal, as "x-direct-download" solves the problem for my colleague and myself.
0 -
@RTPKRSQT48, I'm happy that using AWS by adding "x-direct-download" solves the issue for you.
But I am still intrigued as to why, when not using AWS, I get better performance than you. If you have time and do not mind doing this, could you try running the latest .Net C# Tutorial 5 code (modifying it only to retrieve more data than it does by default), and compare its download performance with that of your code (based on the C# example app)? It might help us understand whether the issue is related to the way it is coded.
0 -
Hello, I am having the same issue.
I did a manual schedule on the DSS website and got a file of 1.5 GB.
When I tried to download it, it took two hours, even with an 800 MB/s line... Something weird and unexpected! My question is: where can I find documentation on downloading, with the API, a file generated with the DSS interface?
Thanks for your help
0 -
@Jerome Guiot-Dorel
From the ScheduleName, you need to use the following endpoint to get the ScheduleId.
GET https://hosted.datascopeapi.reuters.com/RestApi/v1/Extractions/ScheduleGetByName(ScheduleName='example-eod')
HTTP/1.1 200 OK
{
"@odata.context": "https://hosted.datascopeapi.reuters.com/RestApi/v1/$metadata#Schedules/$entity",
"ScheduleId": "0x0580701ad3ab1f86",
"Name": "example-eod",
"OutputFileName": "",
…
}
Then, after getting the ScheduleId, you can use the following endpoints to get the extractions:
- Extractions/Schedules({id})/CompletedExtractions returns the list of completed extractions for this schedule
- Extractions/Schedules({id})/LastExtraction returns information about the last completed extraction for this schedule
GET https://hosted.datascopeapi.reuters.com/RestApi/v1/Extractions/Schedules('0x0580702a9d0b1f86')/LastExtraction
HTTP/1.1 200 OK
{
"@odata.context": "https://hosted.datascopeapi.reuters.com/RestApi/v1/$metadata#ReportExtractions/$entity",
"ReportExtractionId": "6651428",
"ScheduleId": "0x0580702a9d0b1f86",
"Status": "Completed",
"DetailedStatus": "Done",
"ExtractionDateUtc": "2016-11-23T20:45:37.476Z",
"ScheduleName": "example-eod",
"IsTriggered": false,
"ExtractionStartUtc": "2016-11-23T20:45:52.000Z",
"ExtractionEndUtc": "2016-11-23T20:45:56.000Z"
}
After that, from the ReportExtractionId, you can get the list of files (data file and note file) belonging to this extraction report.
GET https://hosted.datascopeapi.reuters.com/RestApi/v1/Extractions/ReportExtractions('6651453')/Files
HTTP/1.1 200 OK
{
"@odata.context": "https://hosted.datascopeapi.reuters.com/RestApi/v1/$metadata#ExtractedFiles",
"value": [
{
"ExtractedFileId": "VjF8fDMzNjIzNTE",
"ReportExtractionId": "6651453",
"ScheduleId": "0x05807049631b1f86",
"FileType": "Full",
"ExtractedFileName": "9001552.ImmediateNonCustomized.20161123.144755.6651453.x04n05.csv",
"LastWriteTimeUtc": "2016-11-23T20:47:57.567Z",
"ContentsExists": true,
"Size": 122,
"ReceivedDateUtc": "2016-11-23T20:47:57.567Z"
},
{
"ExtractedFileId": "VjF8fDMzNjIzNTA",
"ReportExtractionId": "6651453",
"ScheduleId": "0x05807049631b1f86",
"FileType": "Note",
"ExtractedFileName": "9001552.ImmediateNonCustomized.20161123.144755.6651453.x04n05.csv.notes.txt",
"LastWriteTimeUtc": "2016-11-23T20:47:57.573Z",
"ContentsExists": true,
"Size": 1777,
"ReceivedDateUtc": "2016-11-23T20:47:57.573Z"
}
]
}
From the ExtractedFileId, you can download a file with this endpoint: Extractions/ExtractedFiles({id})/$value.
GET https://hosted.datascopeapi.reuters.com/RestApi/v1/Extractions/ExtractedFiles('VjF8fDMyMDk1NDc2OQ')/$value
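The chain of calls above can be sketched as a few URL builders, using only the endpoints and example ids shown in this answer (each GET also needs the usual Authorization: Token <token> header):

```python
# URL builders for the schedule -> extraction -> file download chain.
DSS_BASE = "https://hosted.datascopeapi.reuters.com/RestApi/v1"

def schedule_by_name_url(schedule_name):
    # Step 1: resolve a ScheduleName to its ScheduleId
    return f"{DSS_BASE}/Extractions/ScheduleGetByName(ScheduleName='{schedule_name}')"

def last_extraction_url(schedule_id):
    # Step 2: get the last completed extraction for that schedule
    return f"{DSS_BASE}/Extractions/Schedules('{schedule_id}')/LastExtraction"

def extraction_files_url(report_extraction_id):
    # Step 3: list the data file and note file of that extraction
    return f"{DSS_BASE}/Extractions/ReportExtractions('{report_extraction_id}')/Files"

def file_content_url(extracted_file_id):
    # Step 4: download the file content itself
    return f"{DSS_BASE}/Extractions/ExtractedFiles('{extracted_file_id}')/$value"

# With the example values from this answer:
print(schedule_by_name_url("example-eod"))
print(last_extraction_url("0x0580702a9d0b1f86"))
print(extraction_files_url("6651453"))
print(file_content_url("VjF8fDMzNjIzNTE"))
```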
There are other endpoints that can be useful, such as ReportExtractionGetCompletedByDateRangeByScheduleId or ReportExtractionGetCompletedByScheduleId. For more information, please refer to the API Reference Tree.
Please focus on the Extractions => Schedule, Extractions => Extraction => ReportExtraction, and Extractions => Extraction => ExtractedFile endpoints.
0 -