How to make sure the Content-Length matches the downloaded files from the Tick History API?

I am looking at an extraction request based upon:
{
  "ExtractionRequest": {
    "@odata.type": "#DataScope.Select.Api.Extractions.ExtractionRequests.TickHistoryRawExtractionRequest",
    "IdentifierList": {
      "@odata.type": "#DataScope.Select.Api.Extractions.ExtractionRequests.InstrumentIdentifierList",
      "InstrumentIdentifiers": [
        { "Identifier": "ESH25", "IdentifierType": "Ric" },
        { "Identifier": "ESM25", "IdentifierType": "Ric" },
        { "Identifier": "ESU25", "IdentifierType": "Ric" }
      ],
      "ValidationOptions": { "AllowHistoricalInstruments": true },
      "UseUserPreferencesForValidationOptions": false
    },
    "Condition": {
      "MessageTimeStampIn": "LocalExchangeTime",
      "ReportDateRangeType": "Range",
      "QueryStartDate": "2025-01-16T17:00:00-06:00",
      "QueryEndDate": "2025-01-17T16:00:00-06:00",
      "DisplaySourceRIC": true
    }
  }
}
I then download the extracted file using the JobId; the headers for that request are:
{ "Content-Type": "application/json", "Accept-Encoding": "gzip" }
Are these the correct headers, or should Accept-Encoding go in the original extraction request?
I am therefore expecting a gzip file. Is that correct?
The value of:
int(response.headers['Content-Length'])
does not match the actual file size. Do you know why?
I am using the Content-Length header to validate what has been downloaded: if the file has been downloaded correctly, its size on disk should in theory match the Content-Length. Can you confirm whether my understanding is correct?
An alternative is to use the Content-MD5 header, but this appears to be missing from the response. Do you know how I can obtain it, so that I can validate against the checksum?
Answers
Hello @prasad.reddy01
The Content-Length in the response header is the number of bytes a receiver should expect to receive on the stream. If you count the size of the read data in your code, you will see that it matches the Content-Length header exactly.
Once the data is written to the file, the file size will be different, because the gzip-encoded stream is decompressed before it is written. See this answer for an explanation.
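The difference between the two sizes can be seen without any network call. A minimal sketch, using a stand-in payload (the CSV-style content is just an illustration, not actual extraction output):

```python
import gzip

# Content-Length counts bytes on the wire. For a gzip-encoded response that
# is the *compressed* size; the decompressed file written to disk is larger.
payload = b"timestamp,ric,price\n" * 100_000   # stand-in for extraction output
compressed = gzip.compress(payload)

wire_size = len(compressed)   # what the Content-Length header would report
file_size = len(payload)      # size on disk after the stream is decoded

print(wire_size < file_size)  # True: the header is smaller than the file
```

So comparing the on-disk file size against Content-Length will always report a mismatch for a compressed transfer; only the raw byte count on the wire is comparable.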
The response for the final extraction result is a gzip-encoded stream, so your Accept-Encoding header is fine - but you can also leave it open and accept everything:
Accept: */*
There is no MD5 checksum provided for raw extraction results. MD5 is only available in the Venue-By-Day (VBD) file downloads.
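Since no server-side checksum is available for raw extractions, one workaround (not an official API feature, just a sketch) is to compute your own hash over the raw bytes while streaming, alongside the byte count you compare to Content-Length; this at least lets you verify that a retry returns identical bytes:

```python
import gzip
import hashlib

# Stand-in for the compressed wire bytes of a download.
body = gzip.compress(b"some raw extraction output\n" * 1_000)

md5 = hashlib.md5()
total_bytes = 0
for offset in range(0, len(body), 8192):   # simulate 8 KB streamed chunks
    chunk = body[offset:offset + 8192]
    md5.update(chunk)                       # hash exactly what was received
    total_bytes += len(chunk)

print(total_bytes == len(body))  # byte count matches the would-be Content-Length
print(md5.hexdigest())           # local checksum of the received stream
```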
Hello Gurpreet
I can see:
I can see:
'Content-Length': '244279206', 'Content-Type': 'text/plain', 'Content-Encoding': 'gzip'
but summing the bytes of the response gives far more than the Content-Length:
for chunk in response.iter_content(chunk_size=chunk_size):
    if chunk:
        total_bytes += len(chunk)
i.e. total_bytes is much larger than Content-Length. Even though the Content-Encoding suggests gzip, the content arrives uncompressed. Is the Content-Length referring to the compressed bytes?
Actually, iter_content decompresses the content. Looking at the raw byte stream instead, it matches the Content-Length.
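Both sizes can be observed side by side with only the standard library (urllib, unlike requests' iter_content, does not auto-decompress the body). A self-contained sketch, with a throwaway local server standing in for the extraction endpoint:

```python
import gzip
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Serve a gzip-compressed body, as the extraction endpoint does.
PAYLOAD = b"timestamp,ric,price\n" * 50_000
BODY = gzip.compress(PAYLOAD)

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Encoding", "gzip")
        self.send_header("Content-Length", str(len(BODY)))
        self.end_headers()
        self.wfile.write(BODY)

    def log_message(self, *args):   # keep the demo quiet
        pass

server = HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

resp = urllib.request.urlopen(f"http://127.0.0.1:{server.server_port}/")
raw = resp.read()                               # NOT auto-decompressed
content_length = int(resp.headers["Content-Length"])

print(len(raw) == content_length)               # True: wire bytes match header
print(len(gzip.decompress(raw)))                # larger on-disk size after decoding
server.shutdown()
```

With requests, the equivalent of `raw` is the undecoded stream (for example via `response.raw`), whereas `iter_content` yields decompressed chunks, which is why its total exceeds Content-Length.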