A 'get_news_story' request into a dataframe

Hi, I am using ek.get_news_headlines to display a dataframe of 5 news articles for a particular company, i.e.
df = ek.get_news_headlines('GOOG.O AND Language:LEN', date_from='2021-01-01T09:00:00', date_to='2023-06-30T23:59:59', count=5)
The above works fine for displaying the last 5 storyIds, but I'd like to use ek.get_news_story to loop through the rows in the above df and pull the article for each storyId into another dataframe. When I try the snippet below, which I found on another post, I just get an HTML dump from the first storyId only.
for idx, storyId in enumerate(headlines['storyId'].values):  # for each row in our df dataframe
    newsText = ek.get_news_story(storyId)  # get the news story
    time.sleep(5)  # sleep for 5 seconds
    print(newsText)
I'd ideally like to see 1 new dataframe containing 5 rows (one row for each news article), one column with the news article's title, another column containing just the text from each article (no HTML tags!), and then another column of the URL.
Any help would be greatly appreciated.
Thank you!
Best Answer
Thank you for reaching out to us.
To get the story text without HTML tags, you need to use the Refinitiv Data Library for Python. The example code is available on GitHub.
The code looks like this:
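import time
import pandas as pd
import refinitiv.data as rd

rd.open_session()  # assumes a session is already configured (e.g. Workspace is running)

headlines = rd.news.get_headlines('GOOG.O AND Language:LEN',
                                  start='2021-01-01T09:00:00',
                                  end='2023-06-30T23:59:59',
                                  count=5)
rows = []
for index, row in headlines.iterrows():
    # get the news story as plain text (no HTML tags)
    newsText = rd.news.get_story(row['storyId'], format=rd.news.Format.TEXT)
    # collect rows in a list; DataFrame.append was removed in pandas 2.0
    rows.append({'headline': row['headline'], 'story': newsText, 'storyid': row['storyId']})
    time.sleep(5)  # pause between requests
df = pd.DataFrame(rows, columns=['headline', 'story', 'storyid'])
df
The output is a dataframe with one row per article and the columns headline, story, and storyid.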
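If switching libraries is not an option and the story comes back as HTML (as ek.get_news_story does in the question), the tags can also be stripped locally. The following is a minimal sketch using only the Python standard library; the helper name html_to_text and the choice of which tags count as line breaks are assumptions made for illustration, not part of either Refinitiv library.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes and ignores tags, scripts, and styles."""

    _SKIP = {"script", "style"}
    _BLOCK = {"p", "div", "br", "li", "tr", "h1", "h2", "h3", "pre"}

    def __init__(self):
        super().__init__(convert_charrefs=True)  # decode &amp; etc. automatically
        self._parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self._SKIP:
            self._skip_depth += 1
        elif tag in self._BLOCK:
            self._parts.append("\n")  # block-level tags start a new line

    def handle_endtag(self, tag):
        if tag in self._SKIP and self._skip_depth > 0:
            self._skip_depth -= 1
        elif tag in self._BLOCK:
            self._parts.append("\n")

    def handle_data(self, data):
        if self._skip_depth == 0:
            self._parts.append(data)

    def text(self):
        # Collapse whitespace inside each line and drop empty lines.
        lines = (" ".join(line.split()) for line in "".join(self._parts).splitlines())
        return "\n".join(line for line in lines if line)


def html_to_text(html):
    """Return the visible text of an HTML fragment (hypothetical helper)."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()
```

With this in place, something like html_to_text(ek.get_news_story(storyId)) could fill the story column, although the Format.TEXT option shown above is the simpler route when the Refinitiv Data Library is available.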
Answers
Thank you, this worked. Any idea of how I can include a column for the timestamp of each article too?
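Please try this one:
import time
import pandas as pd
import refinitiv.data as rd

rd.open_session()  # assumes a session is already configured (e.g. Workspace is running)

headlines = rd.news.get_headlines('GOOG.O AND Language:LEN',
                                  start='2021-01-01T09:00:00',
                                  end='2023-06-30T23:59:59',
                                  count=5)
# the timestamp is the index; reset_index() turns it into the 'versionCreated' column
headlines = headlines.reset_index()
rows = []
for index, row in headlines.iterrows():
    # get the news story as plain text (no HTML tags)
    newsText = rd.news.get_story(row['storyId'], format=rd.news.Format.TEXT)
    # collect rows in a list; DataFrame.append was removed in pandas 2.0
    rows.append({'timestamp': row['versionCreated'], 'headline': row['headline'],
                 'story': newsText, 'storyid': row['storyId']})
    time.sleep(5)  # pause between requests
df = pd.DataFrame(rows, columns=['timestamp', 'headline', 'story', 'storyid'])
df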
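The loop above stops at the first story that fails to download. As a rough sketch of a more forgiving version, the function below takes the fetch call as a parameter and records an error instead of raising. The names collect_stories, fetch_story, pause_seconds, and the extra error field are hypothetical and chosen only for this illustration; the row keys mirror the columns used above.

```python
import time


def collect_stories(headline_rows, fetch_story, pause_seconds=5.0):
    """Fetch one story per headline row and return a list of row dicts.

    headline_rows: iterable of dicts with 'storyId', 'headline', 'versionCreated'
    fetch_story:   callable that takes a story ID and returns the story text
    """
    rows = []
    for i, item in enumerate(headline_rows):
        if i > 0 and pause_seconds > 0:
            time.sleep(pause_seconds)  # pause between requests, as in the loop above
        try:
            story = fetch_story(item["storyId"])
            error = None
        except Exception as exc:  # keep going if a single story fails
            story = None
            error = str(exc)
        rows.append({
            "timestamp": item["versionCreated"],
            "headline": item["headline"],
            "story": story,
            "storyid": item["storyId"],
            "error": error,
        })
    return rows
```

Assuming the headlines dataframe from the answer, this could be called as collect_stories(headlines.reset_index().to_dict('records'), lambda sid: rd.news.get_story(sid, format=rd.news.Format.TEXT)), and the returned list passed to pd.DataFrame to build the final table.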
Thank you @Jirapongse, this was exactly what I was looking for!
One last question please regarding this topic:
Is it possible to do a free-form search as part of this news query? For example, could I pull news articles into a dataframe where "Elon Musk SpaceX" is my search term? Thank you!
Yes, you can use the free text search.
df = ek.get_news_headlines(query='\\"Elon Musk SpaceX\\"', count=100)
df
@Jirapongse, another question please: how would I run the same query using the company's PermID instead of the "TSLA.O" code? Some of the companies in my search are not publicly traded. Thank you!