I fetched news stories into a DataFrame and saved them to a CSV. The story text contains HTML tags. Is there a way I can get the stories as plain text?
import eikon as ek
import pandas as pd

# Assumes ek.set_app_key(...) has already been called for this session.
syntax1 = "(Hospital OR Health Center OR Medical center OR health system OR university hospital OR Emergency Department OR Inpatient OR Rehabilitat OR ICU ) AND ( build OR reopen OR construct OR expansion OR upgrade OR develop OR repurpose OR modern )"
df = ek.get_news_headlines(syntax1, 100, date_from="2021-03-25T00:00:00", date_to="2021-04-10T00:00:00")

# Collect one row per headline; DataFrame.append was removed in pandas 2.0.
rows = []
for index, headline_row in df.iterrows():
    story = ek.get_news_story(headline_row['storyId'])  # returns the story as HTML
    rows.append({'DATE': index, 'STORY': story})
stories = pd.DataFrame(rows, columns=['DATE', 'STORY']).set_index('DATE')
result = pd.concat([df, stories], axis=1)
result.to_csv("news.csv")
The resulting DataFrame has each story in the STORY column, but the text is still wrapped in HTML markup. I want to get rid of the HTML tags.
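
To make the goal concrete, here is a minimal sketch of the kind of conversion I am after, using only the standard library's html.parser. The names html_to_text and _TextExtractor are hypothetical helpers of my own, not part of the Eikon API, and I am assuming each story is a plain HTML fragment with no script or style blocks whose contents would need to be skipped.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Hypothetical helper: keeps text nodes and ignores every tag."""

    def __init__(self):
        # convert_charrefs=True turns entities such as &amp; into characters.
        super().__init__(convert_charrefs=True)
        self._chunks = []

    def handle_data(self, data):
        self._chunks.append(data)

    def text(self):
        # Join chunks with a space so adjacent paragraphs do not run together,
        # then collapse the whitespace left behind by the removed tags.
        return " ".join(" ".join(self._chunks).split())


def html_to_text(html):
    """Hypothetical helper: return the visible text of an HTML fragment."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return parser.text()


# Possible use before writing the CSV, assuming every STORY value is a string:
# result['STORY'] = result['STORY'].map(html_to_text)
```

Joining chunks with a space means text split by an inline tag in the middle of a word would gain a space, which seems an acceptable trade-off for news stories made of paragraphs.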