What is the cleanest way to get the RIC data out from the get_news_story return value?
Can you share a beautiful soup or json parser snippet for isolating the RICs?
Hello @mjg,
Try:
storyResult = ek.get_news_story(story_id, True, True)
storyResult['story']['storyHtml']
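from bs4 import BeautifulSoup
# Name the parser explicitly so Beautiful Soup does not have to guess one
soup = BeautifulSoup(storyResult['story']['storyHtml'], 'html.parser')
for a in soup.find_all('a', href=True):
    # .get() avoids a KeyError on links that carry no data-type attribute
    if a.get('data-type') == 'ric':
        print("Found:", a['data-type'], a['data-ric'])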
On my side, this results in:
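For anyone without Beautiful Soup installed, the same extraction can be sketched with only the Python standard library. This is a minimal sketch, not an official method: it assumes, as in the snippet above, that RICs appear on <a> tags carrying data-type="ric" and a data-ric attribute. The names RicCollector, extract_rics and sample_html are made up for illustration, and the sample HTML is hypothetical, not a real story.

```python
from html.parser import HTMLParser


class RicCollector(HTMLParser):
    """Collect the data-ric value of every <a data-type="ric"> tag."""

    def __init__(self):
        super().__init__()
        self.rics = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        # Assumption: RIC links are marked with data-type="ric"
        if attr_map.get("data-type") == "ric" and attr_map.get("data-ric"):
            self.rics.append(attr_map["data-ric"])


def extract_rics(story_html):
    """Return the RICs found in story_html, de-duplicated, in first-seen order."""
    collector = RicCollector()
    collector.feed(story_html)
    collector.close()
    return list(dict.fromkeys(collector.rics))


# Hypothetical sample, standing in for storyResult['story']['storyHtml']
sample_html = (
    '<p>Shares of <a href="#" data-type="ric" data-ric="AAPL.O">Apple</a> rose, '
    'while <a href="#" data-type="ric" data-ric="MSFT.O">Microsoft</a> fell. '
    '<a href="https://example.com">Read more</a> about '
    '<a href="#" data-type="ric" data-ric="AAPL.O">Apple</a>.</p>'
)

print(extract_rics(sample_html))  # ['AAPL.O', 'MSFT.O']
```

In practice you would pass storyResult['story']['storyHtml'] to extract_rics instead of the sample string. Links without a data-type attribute are simply skipped.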
Thanks. I am not a Beautiful Soup expert, so having that snippet is helpful. How would I know for sure, besides inspecting manually or asking you, that all RIC data would be in the "<a href" blocks of the return value?
Hello @mjg,
I don't believe this level of detail is covered in the EDAPI Reference Guide. In my opinion, the raw version of the story that is made available is intended for presentation, display, and layout, often as HTML, and is not ideally suited to the task of finding which companies the story is about.
To take the question wider: to know with certainty which companies or instruments a story is mainly about, the RDP News product is much better targeted, as each story comes categorized with rich metadata, including a Subjects list with the RICs and PermIDs that the story is about.
To take this one step further, the calculated relevance of each subject to the story, as well as other news analytics, can be obtained with the News Analytics product.
Hope that this information is of help.