News headline and story to CSV file

Koji.Miyamoto

Hi,

I would like to make a CSV file of news headlines and stories.

For the headlines I'm using:

headlines = ek.get_news_headlines('JPY=')

For the stories I'm using:

for index, headline_row in headlines.iterrows():
    story = ek.get_news_story(headline_row['StoryId'])
    print(story)

Then I call df.to_csv('news.csv').

Does anyone know what I need to fix?

Regards

Accepted answers

Jirapongse

Do you mean adding a Story column to the headlines data frame? If so, the code is:

import eikon as ek
import pandas as pd

headlines = ek.get_news_headlines("R:JPY= IN JAPANESE", count=100, date_from='2018-01-10T13:00:00', date_to='2018-01-10T15:00:00')
# Fetch the story for each headline, keyed by the headline's index (its date).
# Rows are collected in a list because DataFrame.append was removed in pandas 2.0.
rows = []
for index, headline_row in headlines.iterrows():
    story = ek.get_news_story(headline_row['storyId'])
    rows.append({'DATE': index, 'STORY': story})
stories = pd.DataFrame(rows, columns=['DATE', 'STORY']).set_index('DATE')
# Join the STORY column onto the headlines by their shared date index.
result = pd.concat([headlines, stories], axis=1)
result.to_csv("news.csv")

The result looks like:

story.png

All comments

First, change StoryId to lowercase storyId in your code when requesting a story:
story = ek.get_news_story(headline_row['storyId'])

Then, as I understand it, you want to save the stories together with their storyId in a CSV file.

If that's correct, note that the to_csv function you're using is a method of the DataFrame class, so you first have to create a DataFrame from a list of stories.
Example:

headlines = ek.get_news_headlines('JPY=')
# Build (storyId, story) pairs, one per headline
stories = [(storyId, ek.get_news_story(storyId)) for storyId in headlines['storyId'].tolist()]
df = pd.DataFrame(stories, columns=['storyId', 'story'])
df.to_csv('news.csv', sep=',', index=False)
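
Story text can contain commas, quotes and line breaks, so it may be worth checking that the CSV reads back cleanly. Here is a minimal sketch of that check. The story IDs and bodies are made up and stand in for what ek.get_news_story would return, and an in-memory buffer is used instead of news.csv:

```python
import io

import pandas as pd

# Made-up (storyId, story) pairs standing in for the ek.get_news_story() results.
stories = [
    ("id-A", 'Line one, with a comma\nLine two with "quotes"'),
    ("id-B", "Plain text"),
]
df = pd.DataFrame(stories, columns=["storyId", "story"])

# Write to an in-memory buffer instead of news.csv so the sketch leaves no file behind.
buffer = io.StringIO()
df.to_csv(buffer, sep=",", index=False)

# Read the CSV back and compare the awkward story with the original text.
buffer.seek(0)
restored = pd.read_csv(buffer)
print(restored.loc[0, "story"] == df.loc[0, "story"])
```

With index=False the file holds only the two named columns, so it reads back with the same headers.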

Koji.Miyamoto

Thank you for your support.

I have one more question: the number of news items differs between DF and RESULTS.

My understanding is that RESULTS includes DF, so I can get a wider range of news using RESULTS than using DF. Is this correct?

Sorry, I am very new to the Eikon APIs.

Thank you for your kind support.

Regards,

Koji

Jirapongse

Could you please explain more about the question or share the code?

If you're comparing the results of the following requests:
headlines = ek.get_news_headlines("R:JPY= IN JAPANESE",...
and
headlines = ek.get_news_headlines('JPY=')

The news parameters are different, so the number of headlines/stories could be different.

Koji.Miyamoto

I meant that the former answer uses:

result = pd.concat([headlines, stories], axis=1)
result.to_csv("news.csv")

But the latter answer uses:

df = pd.DataFrame(stories, columns=['storyId', 'story'])
df.to_csv('news.csv', sep=',',index=False)

What is the difference between result= and df=?

Jirapongse

As mentioned by pierre.faurel, the news parameters are different, so the number of headlines/stories could be different.

result uses headlines from ek.get_news_headlines("R:JPY= IN JAPANESE", count=100, date_from='2018-01-10T13:00:00', date_to='2018-01-10T15:00:00') while df uses headlines from ek.get_news_headlines('JPY=').

Koji.Miyamoto

Sorry for not giving enough information.

I meant the definitions of result= and df=.

My understanding is that if I want more than 2 columns, I should use result=, and if I want just 2 columns, I should use df=.

Is this correct?

Regards,

Koji

Jirapongse

Yes, you are correct.

result in the first sample uses concat to merge two data frames (headlines and stories) on the date, which is their index. Counting that DATE index, the headlines data frame has 5 columns (DATE, versionCreated, text, storyId, and sourceCode) and the stories data frame has 2 (DATE and STORY). After merging, the result data frame has 6 columns, again counting DATE, which remains the index.

df in the second sample creates a new data frame with two columns: storyId, and story.
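
To see the difference without calling Eikon, here is a rough sketch using made-up data. The headlines frame below is a hypothetical stand-in for what ek.get_news_headlines returns (a date index plus the columns named above), and fake_stories stands in for ek.get_news_story. None of the values are real:

```python
import pandas as pd

# Hypothetical stand-in for the frame returned by ek.get_news_headlines():
# a datetime index plus the four data columns named in the answer above.
index = pd.to_datetime(["2018-01-10 13:05:00", "2018-01-10 14:20:00"])
headlines = pd.DataFrame(
    {
        "versionCreated": index,
        "text": ["Headline A", "Headline B"],
        "storyId": ["id-A", "id-B"],      # made-up IDs
        "sourceCode": ["SRC1", "SRC2"],   # made-up source codes
    },
    index=index,
)

# Stand-in for the story bodies that ek.get_news_story() would return.
fake_stories = {"id-A": "Body of story A", "id-B": "Body of story B"}

# First sample: a STORY frame sharing the headlines' index, joined column-wise.
stories = pd.DataFrame(
    {"STORY": [fake_stories[s] for s in headlines["storyId"]]},
    index=headlines.index,
)
result = pd.concat([headlines, stories], axis=1)

# Second sample: a fresh two-column frame built from (storyId, story) pairs.
df = pd.DataFrame(
    [(s, fake_stories[s]) for s in headlines["storyId"]],
    columns=["storyId", "story"],
)

print(list(result.columns))  # ['versionCreated', 'text', 'storyId', 'sourceCode', 'STORY']
print(list(df.columns))      # ['storyId', 'story']
```

So result keeps every headline column and adds STORY next to them, while df holds only the two columns it was built from. Both have one row per headline when they start from the same headlines request.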

Koji.Miyamoto

Thank you very much!

Your answer is very helpful.

Kind regards,

Koji
