Fewer Observations when downloading data again

Antonio_v
Antonio_v Newcomer
edited June 25 in Eikon Data APIs

Hello everyone,

I am a PhD student and I've been using Refinitiv to download firm-level data for my research. I'm currently facing a problem. I downloaded my main dataset around September 2024. In March 2025, I needed to add another variable, so I re-ran the same script to download the data through the Python API. The resulting dataset had around 500 fewer observations than the first one I downloaded. I got the same result using both the Eikon API and the Datastream API. Today, I tried to download the dataset again, and it had another 100 fewer observations. It may be relevant that my university only provides me with access to the Workspace for Student. I contacted the Helpdesk, but they suggested posting my query here given the scale of the data request.

The following is the main line used to retrieve the data through the Python Eikon API:
data_y, err = ek.get_data(
    instruments=[permID_list[x]],
    fields=[
        'TR.F.ToTRevenue(SDate=2023-03-31,EDate=-13,Period=FY0,Frq=FY,Curn=EUR)',
        'TR.F.LaborRelExpnTot(SDate=2023-03-31,EDate=-13,Period=FY0,Frq=FY,Curn=EUR)',
        'TR.F.COGSTot(SDate=2023-03-31,EDate=-13,Period=FY0,Frq=FY,Curn=EUR)',
        'TR.F.TotAssets(SDate=2023-03-31,EDate=-13,Period=FY0,Frq=FY,Curn=EUR)',
        'TR.F.TotLiab(SDate=2023-03-31,EDate=-13,Period=FY0,Frq=FY,Curn=EUR)',
        'TR.F.TotFixedAssetsNet(SDate=2023-03-31,EDate=-13,Period=FY0,Frq=FY,Curn=EUR)',
        'TR.F.SGATot(SDate=2023-03-31,EDate=-13,Period=FY0,Frq=FY,Curn=EUR)',
        'TR.F.PPEGrossTot(SDate=2023-03-31,EDate=-13,Period=FY0,Frq=FY,Curn=EUR)',
        'TR.F.PPENetTot(SDate=2023-03-31,EDate=-13,Period=FY0,Frq=FY,Curn=EUR)',
        'TR.F.InvstLT(SDate=2023-03-31,EDate=-13,Period=FY0,Frq=FY,Curn=EUR)',
        'TR.F.CashCashEquivTot(SDate=2023-03-31,EDate=-13,Period=FY0,Frq=FY,Curn=EUR)',
        'TR.F.GrossProfIndPropTot(SDate=2023-03-31,EDate=-13,Period=FY0,Frq=FY,Curn=EUR)',
        'TR.F.OpProfBefNonRecurIncExpn(SDate=2023-03-31,EDate=-13,Period=FY0,Frq=FY,Curn=EUR)',
        'TR.F.IntrExpn(SDate=2023-03-31,EDate=-13,Period=FY0,Frq=FY,Curn=EUR)',
        'TR.F.DebtTot(SDate=2023-03-31,EDate=-13,Period=FY0,Frq=FY,Curn=EUR)',
        'TR.F.InvntTot(SDate=2023-03-31,EDate=-13,Period=FY0,Frq=FY)',
        'TR.NACEClassification',
        'TR.NAICSSectorAllCode',
        'TR.NAICSIndustryGroupAllCode',
        'TR.TRBCIndustry',
        'TR.HeadquartersCountry',
        'TR.RegistrationCountry',
        'TR.OrgFoundedYear',
    ],
)
In other words, for each of the PermID codes listed in a separate CSV file ("permID_list"), I retrieve the following data items: Total Revenue, Total Labour Expenditure, Total COGS, Total Assets, Total Liabilities, Total Fixed Assets, Total SGA, Total PPE Gross, Total PPE Net, Investment Long Term, Total Cash and Cash Equivalents, Total Gross Profits, Total Operating Profits, Interest Expenditure, Total Debt, Total Inventories, NACE classification, NAICS code (2-digit), NAICS code (4-digit), TRBC Industry, Country of Headquarters, Country of Registration, Organisation Founding Year. All the time-series variables are requested annually for the period from 2010 to 2022.

Is there any way I could recover the data from September 2024? Having as many observations as possible is really important for my research and the completion of my thesis.

Thank you for any help you could provide!

Answers

  • Gurpreet
    Gurpreet admin
    edited June 25

    Hello @Antonio_v

    Can you please provide the list of PermIDs that you are using in this query? Also, please note that we recommend using the LSEG Data Library, which is the strategic API.

    Furthermore, you can pass the date settings to the API call as a separate parameters object instead of repeating them in every field. For example:

     ek.get_data(instruments=[...], fields=[...], parameters={'SDate':'2020-10-01', 'EDate':'2020-10-05', 'Period': 'FQ0'…})
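
A minimal sketch of that idea, using three of the fields from the original post. The helper name `build_request` is hypothetical and not part of the Eikon library; it only assembles the two arguments, and the actual `ek.get_data` call is left commented out. One thing to keep in mind: as far as I understand, a shared `parameters` object applies to every field in the request, so a field that was originally requested without `Curn=EUR` (such as `TR.F.InvntTot`), and the reference fields, would receive the same settings.

```python
def build_request(field_names, **shared):
    """Return (fields, parameters) with the shared settings kept out of the field strings.

    `build_request` is an illustrative helper, not an Eikon API function.
    """
    fields = [f"TR.F.{name}" for name in field_names]
    parameters = dict(shared)
    return fields, parameters


fields, parameters = build_request(
    ["TotAssets", "TotLiab", "DebtTot"],
    SDate="2023-03-31",
    EDate="-13",
    Period="FY0",
    Frq="FY",
    Curn="EUR",
)

# Assumed usage, mirroring the call above:
# data_y, err = ek.get_data(instruments=[...], fields=fields, parameters=parameters)
```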
    
  • Antonio_v
    Antonio_v Newcomer

    Hi @Gurpreet

    Thanks for your reply! I attach here the file with the list of PermIDs. Would using the LSEG Data Library solve my issue? I will try it today and check.

  • Hello @Antonio_v

    Your instrument list is quite large. Combined with the number of fields being requested, you are certainly running into the limits of data that can be retrieved from the desktop.

    Please see the usage and limits guideline document and try to keep the requested data within the limit.
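
As a rough way to size a request before sending it, the sketch below estimates how many cells a call could return. It assumes, pessimistically, that every field yields one value per instrument per year, and it takes the cap as an input: the `max_cells` value used here is a placeholder, so take the real figure from the guideline document. The function names are illustrative only.

```python
def estimated_cells(n_instruments, n_fields, n_periods=1):
    """Upper-bound estimate: one cell per instrument, field and period."""
    return n_instruments * n_fields * n_periods


def max_instruments_per_request(n_fields, n_periods, max_cells):
    """Largest instrument count whose estimate stays within max_cells (at least 1)."""
    return max(1, max_cells // (n_fields * n_periods))


# The query in the question asks for 23 fields over 13 fiscal years (2010 to 2022).
per_instrument = estimated_cells(1, 23, 13)

# Placeholder cap for illustration; the real limit is in the guideline document.
batch_size = max_instruments_per_request(23, 13, max_cells=10_000)
```

Under these assumptions a single instrument accounts for 299 cells, so a 10,000-cell cap would allow at most 33 instruments per request.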

  • Antonio_v
    Antonio_v Newcomer

    Hello @Gurpreet

    It took me a while to implement all your suggestions. I used the LSEG Data Library API and split the download into chunks of 300 instruments, with a 5-second pause between requests inside each loop.

    However, even in this case I end up with a dataset with fewer observations. In fact, the data downloaded now has 1000 fewer observations for each variable than the dataset downloaded in March. Is there anything else I could do to solve my issue?

    Thanks a lot for your help
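
One way to narrow this down is to find exactly which instrument and date pairs are present in the earlier download but absent from the later one. The sketch below does that with the standard library only. The column names (`Instrument`, `Date`, `Total Revenue`) and the tiny inline CSV snippets are made-up stand-ins, not real output; substitute the columns of the saved files.

```python
import csv
import io


def observation_keys(csv_text, id_col="Instrument", date_col="Date", value_col="Total Revenue"):
    """Return the set of (instrument, date) pairs that have a non-empty value."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {
        (row[id_col], row[date_col])
        for row in reader
        if row[value_col].strip() != ""
    }


def missing_observations(old_keys, new_keys):
    """Pairs that were populated in the old snapshot but not in the new one."""
    return sorted(old_keys - new_keys)


# Toy snapshots for illustration only.
old_snapshot = """Instrument,Date,Total Revenue
A,2021-12-31,100
B,2021-12-31,250
"""
new_snapshot = """Instrument,Date,Total Revenue
A,2021-12-31,100
B,2021-12-31,
"""

missing = missing_observations(
    observation_keys(old_snapshot), observation_keys(new_snapshot)
)
```

A list like `missing` makes it possible to request only the affected instruments again and to see whether the values are empty at the source or were lost during the download.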

  • Hello @Antonio_v

    Can you confirm that the data requested is still within the limits guideline?

    If yes, then please provide the code and instrument list so that I can verify your observations on my end.

  • Antonio_v
    Antonio_v Newcomer

    Hi @Gurpreet

    It seems that the data request is within the limits, as:

    • I make one request at a time (each definitely takes more than one second to complete)
    • I request far fewer than 10000 cells/request (using get_data)
    • I put a timer.wait(5) to have a 5-second pause between requests
    • I broke down the loop for the requests into chunks of 300 instruments

    I might be missing something, but I think I should not be hitting the limits now. I am attaching the code and the instrument list; since I cannot upload the former in its native .ipynb extension, I created a pdf version. Let me know if you prefer any other format. Thank you so much for your support.

    Antonio
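
The batching approach described in this post can be sketched as follows. This is not the attached notebook. `fetch` is a hypothetical callable standing in for whatever wraps the real data call, and the loop records batches that raise an error instead of dropping them silently, since unnoticed failed batches are one possible source of missing observations.

```python
import time


def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements each."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def download_in_batches(instruments, fetch, batch_size=300, pause_seconds=5.0, sleep=time.sleep):
    """Call `fetch` once per batch, pausing between calls.

    Returns (results, failed): the successful results in order, and the
    batches whose fetch raised an exception, so they can be retried.
    """
    results, failed = [], []
    batches = list(chunked(instruments, batch_size))
    for index, batch in enumerate(batches):
        try:
            results.append(fetch(batch))
        except Exception:
            failed.append(batch)
        if index < len(batches) - 1:
            sleep(pause_seconds)  # no pause needed after the final batch
    return results, failed


# Illustration with a stand-in fetch that just reports the batch size.
pauses = []
results, failed = download_in_batches(
    list(range(650)), fetch=len, sleep=pauses.append
)
```

Passing `sleep` as an argument keeps the example quick to run; in real use the default `time.sleep` applies. Checking `failed` after each run, and comparing the number of instruments returned with the number requested, shows whether any batch came back incomplete.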

  • Jirapongse
    Jirapongse ✭✭✭✭✭

    @Antonio_v

    You may change the underlying platform to RDP by running the following code before opening a session.

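    import lseg.data as ld

    # Switch the data grid's underlying platform to RDP before the session opens.
    config = ld.get_config()
    config.set_param("apis.data.datagrid.underlying-platform", "rdp")

    ld.open_session()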
    

    You can also enable the debug log in the library by using the following code before opening a session.

    config = ld.get_config()
    config.set_param("logs.transports.file.enabled", True)
    config.set_param("logs.transports.file.name", "lseg-data-lib.log")
    config.set_param("logs.level", "debug")

    With this code, the library will create an lseg-data-lib.log file. You can check the responses the library received in that log file.