Attachment: check_headlines_eikon.zip
Dear all,
I am trying to download news headlines with the Eikon Python API (ek.get_news_headlines), using the following filters:
QUERY 1
(Eurozone OR Austria OR Belgium OR Cyprus OR Germany OR Estonia OR Spain OR Finland OR France OR Greece OR Ireland OR Italy OR Lithuania OR Luxembourg OR Latvia OR Malta OR Netherlands OR Portugal OR Slovenia OR Slovakia OR Europe)
AND
("credit line" OR "credit lines" OR "line of credit" OR "lines of credit" OR "revolving credit" OR "credit facility" OR "credit facilities" OR "drawdown" OR "drawdowns")
Basically, I want to see which news items originating in the Eurozone/Europe mention certain terms.
One difficulty is that I cannot tell which country (or group of countries) a given headline originates from or refers to. To work around this, I ran the following queries:
QUERY 2
(Eurozone) AND ("credit line" OR "credit lines" OR "line of credit" OR "lines of credit" OR "revolving credit" OR "credit facility" OR "credit facilities" OR "standby line" OR "standby lines" OR "drawdown" OR "drawdowns" OR "drew down" OR "drawn down" OR "drawing down" OR "draw down" OR "draws down" OR "undrawn")
(Austria) AND ("credit line" OR "credit lines" OR "line of credit" OR "lines of credit" OR "revolving credit" OR "credit facility" OR "credit facilities" OR "standby line" OR "standby lines" OR "drawdown" OR "drawdowns" OR "drew down" OR "drawn down" OR "drawing down" OR "draw down" OR "draws down" OR "undrawn")
(Belgium) AND ("credit line" OR "credit lines" OR "line of credit" OR "lines of credit" OR "revolving credit" OR "credit facility" OR "credit facilities" OR "standby line" OR "standby lines" OR "drawdown" OR "drawdowns" OR "drew down" OR "drawn down" OR "drawing down" OR "draw down" OR "draws down" OR "undrawn")
……
and so on for all countries specified in QUERY 1 (plus some additional ones like the UK and the Nordics), appending the results of the individual country queries to one another.
At this point, I expected the results of QUERY 1 to be a subset of the results of QUERY 2, since cross-border headlines retrieved with QUERY 2 should be repeated for every country they refer to, and since every country-level subquery of QUERY 2 contains more keywords than QUERY 1. Inexplicably, this does not hold.
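For reference, the per-country queries of QUERY 2 can be generated programmatically rather than typed out one by one. The following is only a minimal sketch: the names COUNTRIES, TERMS and build_query are hypothetical, and the country list is abbreviated to the three entries shown above.

```python
# Hypothetical names for illustration; the country list is abbreviated.
COUNTRIES = ["Eurozone", "Austria", "Belgium"]

# Keywords exactly as used in QUERY 2.
TERMS = [
    "credit line", "credit lines", "line of credit", "lines of credit",
    "revolving credit", "credit facility", "credit facilities",
    "standby line", "standby lines", "drawdown", "drawdowns",
    "drew down", "drawn down", "drawing down", "draw down",
    "draws down", "undrawn",
]


def build_query(country, terms):
    """Return '(country) AND ("term1" OR "term2" OR ...)'."""
    quoted = " OR ".join('"{}"'.format(t) for t in terms)
    return "({}) AND ({})".format(country, quoted)


# One query string per country, keyed by country name so that the
# results can later be tagged with the country that produced them.
queries = {country: build_query(country, TERMS) for country in COUNTRIES}

print(queries["Austria"])
```

Keeping the country name as the dictionary key makes it possible to add a country column to each result before appending, which at least records which subquery returned a given headline.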
The code that I use is the following, with the query q corresponding to one of the above:
import datetime as dt
import eikon as ek
import pandas as pd

# q holds one of the query strings above; ek.set_app_key(...) has already been called
fmt = "%Y-%m-%dT%H:%M:%S"
date = dt.datetime.today().strftime(fmt)
start_date = dt.datetime(2019, 1, 1).strftime(fmt)
prev_date = date
headlines = pd.DataFrame()  # accumulates the results of all iterations
while date > start_date:  # loop needed to work around the limit of 100 headlines per query
    print('Retrieving headlines up to ' + date[:10])
    hdlns_dwld = ek.get_news_headlines(query=q, count=100, date_to=date, debug=True)
    headlines = pd.concat([hdlns_dwld, headlines], sort=True)
    date = min(headlines.index).strftime(fmt)  # oldest timestamp retrieved so far
    if prev_date == date:  # no older headlines were returned, so stop
        break
    prev_date = date
To give you better insight into the problem, I have attached a csv file containing the merged results of the two queries, with the variable origin indicating which query retrieved each headline.
Question 1: How is it possible that QUERY 1 retrieves headlines that do not appear in the results of QUERY 2?
Question 2: Furthermore, it would be extremely helpful to have a better way to identify the country (or countries) of origin of a specific headline. Is this somehow possible with the Eikon APIs?
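A comparison of this kind can be reproduced along the following lines. This is only a minimal sketch, assuming that each headline carries a unique identifier (for instance a storyId column in the returned DataFrame, which is my assumption here). The function name split_by_origin and the identifiers are made up for illustration.

```python
def split_by_origin(q1_ids, q2_ids):
    """Classify story identifiers by which query returned them."""
    q1, q2 = set(q1_ids), set(q2_ids)
    return {
        "both": q1 & q2,
        "query1_only": q1 - q2,  # these contradict the subset expectation
        "query2_only": q2 - q1,
    }


# Made-up identifiers, purely for illustration.
result = split_by_origin(["a", "b", "c"], ["b", "c", "d"])

print(sorted(result["query1_only"]))  # ['a']
```

If QUERY 1 were really a subset of QUERY 2, the query1_only set would be empty. Comparing on an identifier rather than on the headline text avoids false mismatches caused by small differences in the text.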
Thank you
Kind regards