How to remove special characters from news extracted with the Eikon API
Hi team, I have a question about retrieving news with the Eikon API. The news body contains many special characters, hyperlinks, and delimiters. Is there any way to clean them up and keep only the raw text? I've attached my code and the original news from Workspace below. Thanks for the help.
Thank you for reaching out to us.
You can use the Refinitiv Data Library for Python instead to get news.
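import refinitiv.data as rd
rd.open_session()  # opens the default session defined in the library configuration
text = rd.news.get_story("urn:newsml:reuters.com:20231121:nHKS3l2gW4:1", format=rd.news.Format.TEXT)
print(text)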
With the Refinitiv Data Library for Python, you can specify the news story's format (HTML or TEXT).
The sample code is available on GitHub.
Hi Jira, thanks for the reply. The new command did clean up the special characters, but it truncated quite a lot of the text.
text = rd.news.get_story("urn:newsml:reuters.com:20231113:nL4S3CE14O:1", format=rd.news.Format.TEXT)
print(text)
Original news:
News extracted:
Every line is truncated right before a hyperlink or RIC. Is this a bug, or is there some adjustment I need to make? Thank you.
@Julian.Bai
This is what I get from the API.
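If the TEXT format drops the text around hyperlinks and RICs, one possible workaround is to request the story in HTML format and strip the markup locally. The following is only a minimal sketch using the Python standard library. It assumes the HTML story body is an ordinary HTML string that still contains the link text, which this thread does not confirm. The names `html_to_text` and `_TextExtractor`, and the sample string, are hypothetical and not part of any Refinitiv library.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Hypothetical helper: collects text nodes, keeps link text, skips script/style."""

    BLOCK_TAGS = {"p", "br", "div", "li", "tr", "h1", "h2", "h3", "pre"}
    SKIP_TAGS = {"script", "style"}

    def __init__(self):
        # convert_charrefs=True turns entities such as &amp; into plain characters
        super().__init__(convert_charrefs=True)
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self._skip_depth += 1
        elif tag in self.BLOCK_TAGS:
            self.parts.append("\n")  # block-level tags become line breaks

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS and self._skip_depth:
            self._skip_depth -= 1
        elif tag in self.BLOCK_TAGS:
            self.parts.append("\n")

    def handle_data(self, data):
        # Text inside <a> is kept, so hyperlink and RIC labels survive
        if not self._skip_depth:
            self.parts.append(data)


def html_to_text(html):
    """Return the visible text of an HTML fragment, one non-empty line per block."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    text = "".join(parser.parts)
    # Collapse runs of whitespace inside each line and drop empty lines
    lines = [" ".join(line.split()) for line in text.splitlines()]
    return "\n".join(line for line in lines if line)


if __name__ == "__main__":
    # Made-up sample, not a real story body
    sample = (
        '<div><p>Shares of <a href="#">0700.HK</a> rose 2% &amp; closed higher.</p>'
        "<p>Second&nbsp;line</p></div>"
    )
    print(html_to_text(sample))
```

To try this on a real story, the string returned by `rd.news.get_story(..., format=rd.news.Format.HTML)` could be passed to `html_to_text`. Whether the result matches the story shown in Workspace would need to be checked against the original.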