Eikon Timeseries limits

Is there a reliable way to get timeseries data via the Eikon Python API? As discussed in multiple places on this forum, if the limit of 3000 points is exceeded then the extra data points are silently dropped (this limit seems to be approximate, as shown below). This leads to perverse and difficult-to-detect behaviour, particularly when an extra requested instrument is invalid. For example, the following returns data going back only to 2015-09-04:

eikon.get_timeseries(
  ['PD1CSCRKSPOT', 'PD5CSCRKSPOT', 'MEDCSCRKSPOT'],
  ['*'], start_date='1990-01-01', end_date='2019-12-10',
)

CLOSE PD1CSCRKSPOT PD5CSCRKSPOT MEDCSCRKSPOT
Date
2015-09-04 NaN NaN 6.253819
2015-09-07 NaN NaN 5.690739
2015-09-08 NaN NaN 8.134933
2015-09-09 NaN NaN 5.045736
2015-09-10 NaN NaN 5.918317
2015-09-11 NaN NaN 7.658715
2015-09-14 NaN NaN 6.249980
2015-09-15 NaN NaN 6.974118
2015-09-16 NaN NaN 7.591630
2015-09-17 NaN NaN 6.259352
2015-09-18 NaN NaN 5.514481
2015-09-21 NaN NaN 7.870199
... ... ... ...
2019-11-25 13.61 21.07 5.560000
2019-11-26 14.43 21.14 4.990000
2019-11-27 13.36 21.54 4.900000
2019-11-28 13.39 21.78 5.210000
2019-11-29 13.94 22.38 4.700000
2019-12-02 -56.61 18.63 3.650000
2019-12-03 -56.64 16.58 3.050000
2019-12-04 -57.20 17.15 3.210000
2019-12-05 -57.33 15.78 3.220000
2019-12-06 -57.63 16.45 3.210000
2019-12-09 -58.04 16.48 3.040000
2019-12-10 -58.08 16.89 2.880000

[1077 rows x 3 columns]

Whereas adding an instrument with no data, NWECSTOPSPOT, shortens the returned history so that it starts at 2016-08-31:

eikon.get_timeseries(
  ['PD1CSCRKSPOT', 'PD5CSCRKSPOT', 'MEDCSCRKSPOT', 'NWECSTOPSPOT'],
  ['*'], start_date='1990-01-01', end_date='2019-12-10',
)

Error with NWECSTOPSPOT: Interval is not supported
CLOSE PD1CSCRKSPOT PD5CSCRKSPOT MEDCSCRKSPOT
Date
2016-08-31 NaN NaN 2.762915
2016-09-01 NaN NaN 3.401465
2016-09-02 NaN NaN 5.071215
2016-09-05 NaN NaN 2.855879
2016-09-06 NaN NaN 3.544512
2016-09-07 NaN NaN 4.051888
2016-09-08 NaN NaN 4.668280
2016-09-09 NaN NaN 2.592213
2016-09-12 NaN NaN 3.125617
2016-09-13 NaN NaN 4.002966
2016-09-14 NaN NaN 3.328931
2016-09-15 NaN NaN 5.627774
... ... ... ...
2019-11-25 13.61 21.07 5.560000
2019-11-26 14.43 21.14 4.990000
2019-11-27 13.36 21.54 4.900000
2019-11-28 13.39 21.78 5.210000
2019-11-29 13.94 22.38 4.700000
2019-12-02 -56.61 18.63 3.650000
2019-12-03 -56.64 16.58 3.050000
2019-12-04 -57.20 17.15 3.210000
2019-12-05 -57.33 15.78 3.220000
2019-12-06 -57.63 16.45 3.210000
2019-12-09 -58.04 16.48 3.040000
2019-12-10 -58.08 16.89 2.880000

@alex-putkov mentions in https://community.developers.refinitiv.com/questions/17137/my-get-timeseries-call-is-producing-an-error-which.html that get_data can be used instead. However, I do not see a way to use this for my case. In addition, according to https://developers.refinitiv.com/eikon-apis/eikon-data-api/docs?content=49692&type=documentation_item, would this not still be limited to 10K entries?

Best Answer

  • Jirapongse
    Jirapongse ✭✭✭✭✭
    Answer ✓

    @mbert

    There are other APIs that can provide historical data. For example:

    However, each product may have its own limitations.

    You may need to contact Refinitiv Sales or Account team for the solution that meets your requirements.

    To use the get_data method to retrieve close price, the code looks like:

    rics = ['PD1CSCRKSPOT', 'PD5CSCRKSPOT', 'MEDCSCRKSPOT', 'NWECSTOPSPOT']

    data3 = ek.get_data(rics,['TR.ClosePrice.Date','TR.ClosePrice'], {'Sdate':'1990-01-01', 'EDate':'2019-12-17'})

    Then, the data frame could be reformatted with the following code:

    import eikon as ek
    import pandas as pd

    # Assumes ek.set_app_key(...) has already been called.
    rics = ['PD1CSCRKSPOT', 'PD5CSCRKSPOT', 'MEDCSCRKSPOT', 'NWECSTOPSPOT']

    # get_data returns a (DataFrame, errors) tuple; the frame is data3[0]
    data3 = ek.get_data(rics, ['TR.ClosePrice.Date', 'TR.ClosePrice'], {'Sdate': '1990-01-01', 'EDate': '2019-12-17'})

    # Build one date-indexed column per instrument
    dfarray = []
    for ric, df_tmp in data3[0].groupby('Instrument'):
        df_tmp = df_tmp.dropna().drop_duplicates()
        df_tmp = df_tmp.set_index('Date').drop(['Instrument'], axis=1)
        df_tmp = df_tmp.rename(columns={'Close Price': ric})
        dfarray.append(df_tmp)

    # Align the per-instrument frames on their dates
    result = pd.concat(dfarray, axis=1, sort=True)
    result.columns.name = 'CLOSE'
    result

    The output is:

    [image: the resulting data frame]
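
    The same long-to-wide reshaping can also be written with pandas' pivot. The following is only a sketch on a small made-up frame: the instrument names AAA and BBB and all values are placeholders, not data returned by Eikon, and the column names are assumed to match the get_data output shown in this thread.

```python
import pandas as pd

# Made-up rows in the long layout that get_data returns (illustrative values only)
long_df = pd.DataFrame({
    'Instrument': ['AAA', 'AAA', 'BBB', 'BBB', 'BBB'],
    'Date': ['2019-12-09T00:00:00Z', '2019-12-10T00:00:00Z',
             '2019-12-09T00:00:00Z', '2019-12-10T00:00:00Z',
             '2019-12-10T00:00:00Z'],
    'Close Price': [1.0, 2.0, 3.0, None, 4.0],
})

wide = (
    long_df.dropna(subset=['Close Price'])                   # drop empty cells
           .drop_duplicates(subset=['Instrument', 'Date'])   # pivot needs unique pairs
           .assign(Date=lambda d: pd.to_datetime(d['Date'], utc=True).dt.tz_localize(None))
           .pivot(index='Date', columns='Instrument', values='Close Price')
)
wide.columns.name = 'CLOSE'
print(wide)
```

    The drop_duplicates step matters here because pivot raises a ValueError when the same (Date, Instrument) pair appears more than once.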

Answers

  • Jirapongse
    Jirapongse ✭✭✭✭✭

    @mbert

    From my test, it divides the maximum limit (3000) by the number of items to calculate the number of data points returned for each item. For example, if there are four items in the list, the number of data points for each item will not exceed 750.

    For the list ['PD1CSCRKSPOT', 'PD5CSCRKSPOT', 'MEDCSCRKSPOT', 'NWECSTOPSPOT'], I set the raw_output to True to get the raw output and count the number of data points for each item.

    [image: raw output with the number of data points counted for each item]

    The number of data points for each item is 750 although the last item doesn't have data.

    With this method, you will approximately know the number of data points returned for each item when requesting the time-series data.
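
    That arithmetic can be turned into a request plan. The following is a minimal sketch, assuming the limit is exactly 3000 (it is described as approximate earlier in this thread), that it is shared evenly across items, and that the data has at most one row per weekday. plan_requests is a hypothetical helper, not part of the Eikon API.

```python
import pandas as pd

def plan_requests(start, end, n_items, limit=3000):
    """Split [start, end] into date windows that each fit the per-item budget."""
    per_item = limit // n_items          # e.g. 3000 // 4 == 750
    days = pd.bdate_range(start, end)    # assumption: one row per weekday at most
    windows = []
    for i in range(0, len(days), per_item):
        block = days[i:i + per_item]
        windows.append((block[0].strftime('%Y-%m-%d'),
                        block[-1].strftime('%Y-%m-%d')))
    return per_item, windows

per_item, windows = plan_requests('1990-01-01', '2019-12-10', n_items=4)
print(per_item)      # 750
print(windows[0])    # first (start_date, end_date) pair
```

    Each (start, end) pair could then be passed as start_date and end_date to get_timeseries and the pieces combined with pd.concat. This has not been checked against the service, so the returned row counts should still be verified.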

  • mbert
    mbert Newcomer

    Thanks @jirapongse.phuriphanvichai. Would you happen to know if there is a more reliable way to obtain this data? Either through the Eikon Python API or another Reuters service? The fact that this service silently drops data depending on query size is very error prone and leads to downstream issues in the reliability of analysis.

    Is there a way to get all series using the get_data() query? While this still has limits, 10K data points is less restrictive. As indicated here https://community.developers.refinitiv.com/questions/17137/my-get-timeseries-call-is-producing-an-error-which.html it appears possible in certain cases, but is this broadly applicable?

  • mbert
    mbert Newcomer

    Thanks @jirapongse.phuriphanvichai. Following up on this, I am a bit confused as to why get_data does not limit the amount of data returned. Looking at a query like

    rics = ['PD1CSCRKSPOT', 'PD5CSCRKSPOT', 'MEDCSCRKSPOT']
    eikon.get_data(rics, ['TR.ClosePrice.Date', 'TR.ClosePrice'], {'Sdate': '1990-01-01', 'EDate': '2019-12-17'})

    ( Instrument Date Close Price
    0 PD1CSCRKSPOT 1990-01-01T00:00:00Z NaN
    1 PD1CSCRKSPOT 1990-01-02T00:00:00Z NaN
    2 PD1CSCRKSPOT 1990-01-03T00:00:00Z NaN
    3 PD1CSCRKSPOT 1990-01-04T00:00:00Z NaN
    4 PD1CSCRKSPOT 1990-01-05T00:00:00Z NaN
    5 PD1CSCRKSPOT 1990-01-08T00:00:00Z NaN
    6 PD1CSCRKSPOT 1990-01-09T00:00:00Z NaN
    7 PD1CSCRKSPOT 1990-01-10T00:00:00Z NaN
    8 PD1CSCRKSPOT 1990-01-11T00:00:00Z NaN
    9 PD1CSCRKSPOT 1990-01-12T00:00:00Z NaN
    10 PD1CSCRKSPOT 1990-01-15T00:00:00Z NaN
    11 PD1CSCRKSPOT 1990-01-16T00:00:00Z NaN
    ... ... ... ...
    23175 MEDCSCRKSPOT 2019-12-02T00:00:00Z 3.65
    23176 MEDCSCRKSPOT 2019-12-03T00:00:00Z 3.05
    23177 MEDCSCRKSPOT 2019-12-04T00:00:00Z 3.21
    23178 MEDCSCRKSPOT 2019-12-05T00:00:00Z 3.22
    23179 MEDCSCRKSPOT 2019-12-06T00:00:00Z 3.21
    23180 MEDCSCRKSPOT 2019-12-09T00:00:00Z 3.04
    23181 MEDCSCRKSPOT 2019-12-10T00:00:00Z 2.88
    23182 MEDCSCRKSPOT 2019-12-11T00:00:00Z 3.16
    23183 MEDCSCRKSPOT 2019-12-12T00:00:00Z 2.51
    23184 MEDCSCRKSPOT 2019-12-13T00:00:00Z 2.97
    23185 MEDCSCRKSPOT 2019-12-16T00:00:00Z 4.78
    23186 MEDCSCRKSPOT 2019-12-17T00:00:00Z NaN

    [23187 rows x 3 columns], None)

    this does not appear to be limiting the response to < 10,000 points as per the documentation from https://developers.refinitiv.com/eikon-apis/eikon-data-api/docs?content=49692&type=documentation_item, i.e.

    Datapoints returned per request - A datapoint is a 'cell', or a unique field value for a unique instrument on a unique time stamp. Datapoint limits vary by the content set being retrieved (for example, timeseries limits are different from news headline limits), but all are throttled on a per request basis and are not aggregated across all applications. Here are Datapoint limit examples per Eikon Data API function type

    get_data: The current limit value (10-Oct-2019) is around 10,000 data points.

    In addition, I'm wondering what the best way is to map get_timeseries fields to get_data fields, e.g. in the above case, knowing that TR.ClosePrice corresponds to CLOSE.

    Lastly, I'm wondering if there is some better documentation somewhere on the DataGrid_StandardAsync and TimeSeries services that get_data and get_timeseries respectively rely on. For example, in your above get_data query at first glance it looks like data was returned for the entire range according to a weekday calendar

    {"responses":[{"columnHeadersCount":1,"data":[["PD1CSCRKSPOT","1990-12-10T00:00:00Z",""],["PD1CSCRKSPOT","1990-12-11T00:00:00Z",""],["PD1CSCRKSPOT","1990-12-12T00:00:00Z",""],["PD1CSCRKSPOT","1990-12-13T00:00:00Z",""],["PD1CSCRKSPOT","1990-12-14T00:00:00Z",""],["PD1CSCRKSPOT","1990-12-17T00:00:00Z",""],...

    but I can't see any documentation confirming this?
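
    In the absence of documentation, the weekday-calendar guess can at least be tested empirically by comparing the returned dates with pandas' business-day range. A minimal sketch, using the six dates visible in the raw response above; missing_weekdays is a hypothetical helper, not part of the Eikon API.

```python
import pandas as pd

def missing_weekdays(dates):
    """Return the weekdays between the first and last date that are absent."""
    idx = pd.to_datetime(list(dates), utc=True).tz_localize(None).normalize()
    expected = pd.bdate_range(idx.min(), idx.max())  # Monday to Friday only
    return expected.difference(idx)

# Dates copied from the raw response fragment quoted above
sample = ["1990-12-10T00:00:00Z", "1990-12-11T00:00:00Z",
          "1990-12-12T00:00:00Z", "1990-12-13T00:00:00Z",
          "1990-12-14T00:00:00Z", "1990-12-17T00:00:00Z"]

gaps = missing_weekdays(sample)
print(len(gaps))  # 0 means every weekday in the span is present
```

    Running this over the full Date column of a response would show whether any weekdays (for example holidays) are skipped, which would contradict a plain weekday calendar.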

  • Jirapongse
    Jirapongse ✭✭✭✭✭

    Yes, you are correct. From the output, the number of data points returned by the get_data method is more than 10,000 data points. I will contact the product team to verify it.

    To find fields that can be used with the get_data method, you can use the Data Item Browser (DIB) tool in Eikon or Formula Builder in Eikon Excel. Otherwise, you may directly contact the Eikon support team to find fields for you.

    All documents for Eikon Data APIs are available on the documentation page. If you need more information, please contact the Eikon support team. I think that the get_timeseries method is similar to the RHistory function in Eikon Excel but the get_timeseries method can only retrieve the default TimeSeries view as in the Common category of RHistory. The get_data method is similar to the TR function in Eikon Excel.

  • mbert
    mbert Newcomer

    @jirapongse.phuriphanvichai thanks, did you ever get any more info on this?

  • Jirapongse
    Jirapongse ✭✭✭✭✭

    @mbert

    I got a response from the development team that this limit isn't applied in the latest version. We will update the guideline.