API Pagination in Python Using Refinitiv Data API

Hello,

I'm working on a Python project using the Refinitiv Data API to retrieve large sets of equity quote data. I need to fetch data in chunks of 10,000 records due to API limitations, but I'm struggling with implementing effective pagination to avoid overlapping data without missing any records. The API doesn't seem to support direct offset management.

Here's what I'm trying to do:

  1. Retrieve 10,000 records at a time.
  2. Ensure the next batch of 10,000 starts right after the last record of the previous batch.
  3. I'm using the rd.discovery.search method but can't find a way to paginate properly.

Here is an illustrative version that uses an offset argument (I know that search() has no such argument; it's only there to show the intent):


import refinitiv.data as rd
import pandas as pd

def retrieve_data():
    rd.open_session()
    offset = 0
    all_data = pd.DataFrame()

    while True:
        results = rd.discovery.search(
            view=rd.discovery.Views.EQUITY_QUOTES,
            top=10000,
            filter="(RCSAssetCategoryLeaf eq 'Ordinary Share')",
            select="RIC, DTSubjectName, RCSAssetCategoryLeaf, ExchangeName, ExchangeCountry",
            offset=offset  # Adjust the offset for each iteration
        )

        df = pd.DataFrame(results)
        if df.empty:
            break

        all_data = pd.concat([all_data, df], ignore_index=True)
        offset += 10000  # Increase the offset to get the next batch of records


        df.to_csv(f'./Retrieved_RICs_{offset}.csv')

    all_data.to_csv('./Retrieved_RICs_All.csv')
    print("Data retrieval complete.")

    rd.close_session()

retrieve_data()
 

Any advice on managing continuation tokens or other effective methods would be greatly appreciated!

Thank you in advance!

Tags: python, workspace, #technology, rdp-api, rfa-api, search, codebook, refinitiv-data-libraries, rdp search, pandas

Hi @vitali

I don't know the policies regarding rate limits and recovery. I did check the API Playground reference documentation on this:

[Attached screenshot: 1713538975069.png]

I suggest you open a support ticket; they will bring in the Search team, which manages the policy around these issues.



Got it, thank you, @nick.zincone, for your time. I appreciate it.

Hi @vitali

The article "Building Search into your Application Workflow" has a section dedicated to limits. That section comes with a number of examples and ways to work around the challenges that limits pose in your workflow.


Hi again, @nick.zincone. Could you help me with the issue I described below?

Hi, @nick.zincone! Thank you for your response. I attempted to implement the code as per the guidelines in Article.DataLibrary.Python.Search. However, I encountered an error: Error code 429, which states, {"message":"Too many requests, please try again later."}. Could you please advise on how to proceed?
Code:

import refinitiv.data as rd
from refinitiv.data.content import search
import pandas as pd

rd.open_session()

# Fetch market cap ranges to use for filters.
# top=0 requests no documents, only the navigator buckets.
response = search.Definition(
    view=search.Views.EQUITY_QUOTES,
    filter="RCSExchangeCountryLeaf eq 'United States' and RCSAssetCategoryLeaf eq 'Ordinary Share'",
    top=0,
    navigators="MktCapTotal(type:range, buckets:13)"
).get_data()

# Take the filter expression of the second bucket and append it to the base filter.
market_cap_filter = response.data.raw["Navigators"]["MktCapTotal"]["Buckets"][1]["Filter"]
full_filter = f"RCSExchangeCountryLeaf eq 'United States' and RCSAssetCategoryLeaf eq 'Ordinary Share' and {market_cap_filter}"
print(market_cap_filter)

# Request up to 10,000 documents from that single bucket.
response = search.Definition(
    view=search.Views.EQUITY_QUOTES,
    filter=full_filter,
    top=10000
).get_data()

print(f"Request resulted in a segment of {response.total} documents.")

rd.close_session()


Output:

---------------------------------------------------------------------------
RDError                                   Traceback (most recent call last)
/tmp/ipykernel_442/2001824659.py in <module>
      6 
      7 # Fetching market cap ranges to use for filters
----> 8 response = search.Definition(
      9     view=search.Views.EQUITY_QUOTES,
     10     filter="RCSExchangeCountryLeaf eq 'United States' and RCSAssetCategoryLeaf eq 'Ordinary Share'",


/opt/conda/lib/python3.8/site-packages/refinitiv/data/delivery/_data/_data_provider_layer.py in get_data(self, session, on_response)
    154             )
    155         on_response and emit_event(on_response, response, self, session)
--> 156         self._check_response(response, session.config)
    157         return response
    158 


/opt/conda/lib/python3.8/site-packages/refinitiv/data/delivery/_data/_data_provider_layer.py in _check_response(self, response, config)
    124 
    125     def _check_response(self, response: Response, config):
--> 126         _check_response(response, config)
    127 
    128     def get_data(


/opt/conda/lib/python3.8/site-packages/refinitiv/data/delivery/_data/_data_provider_layer.py in _check_response(response, config, response_class)
     31 
     32             error.response = response
---> 33             raise error
     34 
     35 


RDError: Error code 429 | {"message":"Too many requests, please try again later."}
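
The snippet above queries only the second bucket (`Buckets[1]`). As a minimal sketch, assuming the navigator response has the same shape used there (`raw["Navigators"]["MktCapTotal"]["Buckets"][i]["Filter"]`), one filter per bucket could be built like this, so that each bucket can be requested separately. The helper name `bucket_filters` and the sample bucket values are made up for illustration and are not part of the Refinitiv library.

```python
from typing import Any, Dict, List

BASE_FILTER = (
    "RCSExchangeCountryLeaf eq 'United States' "
    "and RCSAssetCategoryLeaf eq 'Ordinary Share'"
)


def bucket_filters(raw: Dict[str, Any], navigator: str, base_filter: str) -> List[str]:
    """Return one combined filter string per navigator bucket.

    Assumes `raw` is shaped like response.data.raw in the snippet above:
    raw["Navigators"][navigator]["Buckets"] is a list of dicts with a "Filter" key.
    """
    buckets = raw["Navigators"][navigator]["Buckets"]
    # Skip any bucket that carries no filter expression.
    return [f"{base_filter} and {b['Filter']}" for b in buckets if b.get("Filter")]


# Hypothetical sample data (invented values, only the structure follows the thread).
sample_raw = {
    "Navigators": {
        "MktCapTotal": {
            "Buckets": [
                {"Filter": "MktCapTotal lt 1000"},
                {"Filter": "(MktCapTotal ge 1000 and MktCapTotal lt 5000)"},
            ]
        }
    }
}

for combined in bucket_filters(sample_raw, "MktCapTotal", BASE_FILTER):
    print(combined)
```

Each resulting string could then be passed as `filter=` to `search.Definition(...)`, one request per bucket. That means as many requests as there are buckets, so they would need to be paced to avoid the 429 shown above.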


Hi @vitali

This is what I was referring to in the Limits section of the article I linked. The service has guardrails to prevent excessive load. One of the main reasons users run into this issue is that they continuously ask for too much data. I don't know your specific requirements or whether you need to request large amounts of data, but in most cases it helps to place filters on the data requests to limit the load and the amount of data returned. What clients mistakenly do is request a massive amount of data and then filter it within their applications; this will ultimately lead to limits being hit.
Unfortunately, the only thing I can suggest is that you wait, or temper and control your requests.
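
One way to "temper" requests is to retry with an increasing delay after a 429. The following is a minimal sketch, not an official Refinitiv mechanism: `call_with_backoff` and `looks_like_429` are hypothetical helpers, and detecting the error by looking for "429" in the exception text is an assumption based on the `RDError: Error code 429` message shown in this thread.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def looks_like_429(exc: Exception) -> bool:
    # Assumption: the rate-limit error text contains "429",
    # as in 'Error code 429 | {"message":"Too many requests, ..."}'.
    return "429" in str(exc)


def call_with_backoff(
    fetch: Callable[[], T],
    is_rate_limited: Callable[[Exception], bool] = looks_like_429,
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fetch(), retrying with exponential backoff on rate-limit errors."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return fetch()
        except Exception as exc:
            # Re-raise anything that is not a rate limit, or the final failure.
            if not is_rate_limited(exc) or attempt == max_attempts - 1:
                raise
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Add up to 10% jitter so repeated callers do not retry in lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))
    raise AssertionError("unreachable")
```

A search request could then be wrapped as `call_with_backoff(lambda: search.Definition(...).get_data())`. If the limit is a longer-lived quota rather than a short-term throttle, retrying will not help; the thread does not establish which one applies.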


@nick.zincone, I am currently encountering an issue where I receive the following error message each time I attempt to execute any code:

Error code 429 | {"message":"Too many requests, please try again later."}

It appears that I have reached the daily limit for requests. Could you please confirm if this is the case and advise when the limit will reset, allowing me to submit requests again?

Thank you!
