Using for loop to get news headlines of multiple companies

I am trying to use a for loop to get ESG news headlines for a list of 405 companies.

My code works when retrieving RICs from ISINs. However, when I apply the same approach to retrieve news headlines, I get the error "query must be a string". When I use "str()" to convert the values to strings and run the for loop, I get the error below:

EikonError: Error code 500 | Backend error. Failed to deserialize backend response. Expected valid JSON. Error: invalid character 'i' looking for beginning of value

The examples below show where my for loop works with the "ek.get_symbology" function and where it fails with the "ek.get_news_headlines" function.

Successful example with ek.get_symbology:

## Ensure all ISINs in the Brands data frame are strings (converting them if necessary)
Brands["ID (ISIN)"] = Brands["ID (ISIN)"].astype(str)

## Create a list of all the ISINs for the RIC call to the API
ISIN_List = Brands["ID (ISIN)"].tolist()

## Divide the list of 450 companies into chunks that the API can process
chunklist = chunk(ISIN_List, 100)

## Retrieve RICs from the API based on ISINs (a for loop batches the request in chunks)
content_df = []
for subs in chunklist:
    RICs = ek.get_symbology(subs, from_symbol_type='ISIN', to_symbol_type='RIC')
    content_df.append(RICs)
content_df = pd.concat(content_df)
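The `chunk` helper used above is not shown in the post. A minimal sketch of what such a helper might look like follows; the name matches the call above, but the signature and behaviour are assumptions, and the identifiers are placeholders rather than real ISINs.

```python
from typing import Iterator, List, Sequence


def chunk(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items`, each holding at most `size` elements."""
    if size <= 0:
        raise ValueError("size must be a positive integer")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


# Placeholder identifiers standing in for the ISIN column (not real ISINs).
isins = [f"ISIN{i:03d}" for i in range(405)]
batches = list(chunk(isins, 100))
print([len(batch) for batch in batches])
```

Each batch is a list, which suits `ek.get_symbology` in the loop above because that call is given a list of symbols.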

Failed attempt with the "ek.get_news_headlines" function:

## Add the "R:" prefix for instrument clarification and ESG as the topic for the headline request
Brands_M["H_Rqst"] = "R:" + Brands_M["RIC"].map(str) + " AND Topic:ESG"

## Ensure all headline requests in the Brands_M data frame are strings (converting them if necessary)
Brands_M["H_Rqst"] = Brands_M["H_Rqst"].astype(str)

## Create a list of RIC codes with the ESG topic to call on the API
H_Q_List = Brands_M["H_Rqst"].tolist()

## Divide the list into chunks for the API to process
H_Q_chunklist = chunk(H_Q_List, 50)

## Retrieve headlines based on RIC plus ESG as the topic
Headlines_df = []
for H_Q_subs in H_Q_chunklist:
    News = ek.get_news_headlines(query=H_Q_subs, count=10, date_from='2020-09-01T09:00:00', date_to='2020-09-25T18:00:00')
    Headlines_df.append(News)
Headlines_df = pd.concat(Headlines_df)

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-13-dbae4b732937> in <module>
      3 Headlines_df = []
      4 for H_Q_subs in H_Q_chunklist:
----> 5     News = ek.get_news_headlines(query=H_Q_subs,count=10,date_from='2020-09-01T09:00:00',date_to='2020-09-25T18:00:00')
      6     Headlines_df.append(News)
      7 

~/opt/anaconda3/lib/python3.7/site-packages/eikon/news_request.py in get_news_headlines(query, count, date_from, date_to, raw_output, debug)
     97         error_msg = 'query must be a string'
     98         logger.error(error_msg)
---> 99         raise ValueError(error_msg)
    100 
    101     if type(count) is not int:

ValueError: query must be a string


Hope you can help, I suspect it has something to do with the addition of the topic.
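A note for readers who hit the same message: the traceback shows `get_news_headlines` raising "query must be a string", and each `H_Q_subs` produced by chunking is a list, not a string. Wrapping that list in `str()` does produce a string, but it is the list's printed form (brackets and quotes included), which is unlikely to be a valid news query. The sketch below illustrates this with plain Python and two sample queries; it does not call the Eikon API.

```python
# Two sample query strings in the same format as the H_Rqst column.
queries = ["R:AAPL.O AND Topic:ESG", "R:IBM.N AND Topic:ESG"]

# A chunk of queries is a list, so it fails a "must be a string" check.
batch = queries[:2]
print(isinstance(batch, str))

# str() on the list returns the list's repr, brackets and quotes included.
as_text = str(batch)
print(as_text)

# Each element, taken on its own, is already a plain string.
print(all(isinstance(query, str) for query in batch))
```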




@Erik77 does this work for you?

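frames = []
ricList = ["AAPL.O","IBM.N","TSLA.O"]
for ric in ricList:
    # get_news_headlines takes one query string per call
    q = "R:" + ric + " AND Topic:ESG"
    print(q)
    df = ek.get_news_headlines(q)
    frames.append(df)
# DataFrame.append was removed in pandas 2.0, so combine the results with pd.concat
df1 = pd.concat(frames)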

Yes it does, thank you!


@Erik77 I think your issue may have something to do with proper escaping of the query string; please see my answer on this thread. Could you please post a complete example of the query string, e.g. print("R:"+ Brands_M["RIC"].map(str) + " AND Topic:ESG")

this works for me:

q="R:" + "IBM" + " AND Topic:ESG"
df = ek.get_news_headlines(q)
df 

I hope this can help.



Thanks for your reply, I am looking into the escaping. In the meantime, see below the print you requested:






@Erik77 So this query looks fine; it just needs to be escaped itself, e.g. "R:MCD AND Topic:ESG"

Can you try:

Brands_M["H_Rqst"] = Brands_M["H_Rqst"].astype(str).values


This is already the case when I pass the column as a list and then split it into chunks. Yet I still seem to get the error:

Or am I missing something?

@Erik77 - they need to be escaped with " rather than '; I believe this is the issue.


@Erik77 actually I just tried with single quotes and it works for me:


Can you try removing all the other parts of the query eg count and start/end date and try again? Does the below work for you:

q='R:MCD AND Topic:ESG'
df = ek.get_news_headlines(q)
df 


Hi @jason.ramchandi this does work as a single query. But I don't seem to be able to loop over 405 companies.

Any ideas on how to do that?
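One possible way to loop over all 405 companies, sketched under assumptions: since a single query string works, call the API once per string instead of once per chunk, and record any query that fails so one bad RIC does not stop the run. `collect_headlines`, `fake_fetch`, and `pause_seconds` are hypothetical names introduced for illustration; `fake_fetch` stands in for `ek.get_news_headlines` so the sketch runs without an Eikon session.

```python
import time
from typing import Any, Callable, Dict, Iterable, List, Tuple


def collect_headlines(
    queries: Iterable[str],
    fetch: Callable[[str], Any],
    pause_seconds: float = 0.0,
) -> Tuple[List[Any], Dict[str, str]]:
    """Call `fetch` once per query string and return (results, failures)."""
    results: List[Any] = []
    failures: Dict[str, str] = {}
    for query in queries:
        try:
            result = fetch(query)  # one string per call, never a list
        except Exception as exc:  # record the failure and keep going
            failures[query] = str(exc)
        else:
            if result is not None:
                results.append(result)
        if pause_seconds:
            time.sleep(pause_seconds)  # optional gap between requests
    return results, failures


def fake_fetch(query: str) -> List[str]:
    """Hypothetical stand-in for ek.get_news_headlines (assumption, not the real API)."""
    if not isinstance(query, str):
        raise ValueError("query must be a string")
    if "BAD" in query:
        raise RuntimeError("backend error")
    return [f"headline for {query}"]


rics = ["AAPL.O", "BAD.X", "IBM.N"]
queries = ["R:" + ric + " AND Topic:ESG" for ric in rics]
results, failures = collect_headlines(queries, fake_fetch)
print(len(results), list(failures))
```

With the real API, `fetch` could be something like `lambda q: ek.get_news_headlines(q, count=10, date_from='2020-09-01T09:00:00', date_to='2020-09-25T18:00:00')` applied to `H_Q_List`, and the returned frames could then be combined with `pd.concat(results)`. Whether a pause between requests is needed depends on the API's rate limits, which are not covered in this thread.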