How to use the Content-MD5 header with curl to validate downloaded files

samara Explorer

We are trying to validate downloaded files using the file size from the response, which is incorrect, and it was suggested that we use the Content-MD5 header instead. We are not familiar with it. Please let us know how to use it.

Best Answer

  • @samara

    The suggestion may relate to this advisory, which recommends retrieving the checksum and file size from the following HTTP header fields returned in the file download response:

    • x-amz-meta-md5sum for the downloaded file’s MD5 checksum if you downloaded the file directly
      from Amazon Web Services.
    • Content-MD5 for the downloaded file’s MD5 checksum if you downloaded from Tick History.
    • Content-Length for the downloaded file’s size in octets (that is, in eight-bit bytes).
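
As a minimal sketch of the size check, the helper below compares a file's size in bytes with an expected value taken from Content-Length. The function name check_size is hypothetical, and it assumes the file on disk is exactly the bytes that were sent (no decompression on the client side).

```shell
# check_size FILE EXPECTED_BYTES  (hypothetical helper name)
# Succeeds (exit status 0) when FILE's size in bytes equals EXPECTED_BYTES,
# i.e. the value reported in the Content-Length header.
check_size() {
    file=$1
    expected=$2
    # wc -c counts bytes; tr removes the padding some wc implementations add
    actual=$(wc -c < "$file" | tr -d '[:space:]')
    [ "$actual" = "$expected" ]
}
```

For the sample response in this thread, the call would be check_size CBT-2018-06-06-NORMALIZEDMP-Report-1-of-1.csv.gz 3471212.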

    Regarding the package delivery ID, I understand that you download VBD files via the /StandardExtractions/UserPackageDeliveries('<package delivery id>')/$value endpoint. You can use curl with the -I option to get only the response headers. The MD5 checksum is in the Content-MD5 header.

    curl -I -Ls -H "Authorization: Token_xxx" -X GET "https://hosted.datascopeapi.reuters.com/RestApi/v1/StandardExtractions/UserPackageDeliveries('0x0634b9ef4aeb3026')/\$value"
    HTTP/1.1 200 OK
    Cache-Control: no-cache
    Pragma: no-cache
    Content-Length: 3471212
    Content-Type: text/plain
    Content-Encoding: gzip
    Content-MD5: a6740f781a6289b145a59be91b0be6e3
    Expires: -1
    Accept-Ranges: bytes
    Server: Microsoft-IIS/7.5
    Content-Disposition: attachment; filename=CBT-2018-06-06-NORMALIZEDMP-Report-1-of-1.csv.gz
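
If you save the headers to a file (for example with curl's -D option), a small helper can pull out a single header value for scripting. This is only a sketch: the function name header_value and the headers file are assumptions, and it returns the last match so that the final response wins when redirects produce several header blocks.

```shell
# header_value HEADERS_FILE NAME  (hypothetical helper name)
# Prints the value of header NAME from a saved header dump.
# Header names are matched case-insensitively.
header_value() {
    # HTTP header lines end in CRLF, so drop the carriage returns first
    tr -d '\r' < "$1" | awk -v name="$2" '
        BEGIN { name = tolower(name) }
        {
            i = index($0, ":")
            if (i > 0 && tolower(substr($0, 1, i - 1)) == name) {
                v = substr($0, i + 1)
                sub(/^[ \t]+/, "", v)   # trim whitespace after the colon
                found = v
                seen = 1
            }
        }
        END { if (seen) print found }'
}
```

With a dump like the output above saved as headers.txt, header_value headers.txt Content-MD5 would print the checksum string.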

    Once the file is downloaded, you can run the md5sum command on Linux to compute the MD5 checksum of the downloaded file, then compare the output with the value in the Content-MD5 header.

    # md5sum CBT-2018-06-06-NORMALIZEDMP-Report-1-of-1.csv.gz
    a6740f781a6289b145a59be91b0be6e3  CBT-2018-06-06-NORMALIZEDMP-Report-1-of-1.csv.gz
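
To automate the comparison, something along these lines could work. It is a sketch that assumes the header carries a hex digest, as in the sample response in this thread, and the function name verify_md5 is hypothetical.

```shell
# verify_md5 FILE EXPECTED_HEX  (hypothetical helper name)
# Succeeds (exit status 0) when the MD5 of FILE matches EXPECTED_HEX.
verify_md5() {
    # Reading from stdin makes md5sum print "<hash>  -"; keep only the hash
    actual=$(md5sum < "$1" | awk '{ print $1 }')
    # md5sum prints lowercase hex, so lowercase the expected value too
    expected=$(printf '%s' "$2" | tr 'A-F' 'a-f')
    [ "$actual" = "$expected" ]
}
```

For example: verify_md5 CBT-2018-06-06-NORMALIZEDMP-Report-1-of-1.csv.gz a6740f781a6289b145a59be91b0be6e3 && echo "checksum OK"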

    Hope this helps. If this is not what you are looking for, please elaborate.

Answers

  • @samara, we need a bit more info to answer:

    • How are you requesting the data? Is this a schedule created manually in the GUI, a schedule created using the API, or a custom (On Demand) request through the API? If this question is not clear, please see this info on Scheduled vs On Demand.
    • How are you downloading the files? Manually from the GUI, or through the API?
  • samara Explorer

    @Christiaan Meihsl

    We are sending custom requests through the API and downloading the files through the API using the package delivery ID.