How to Retrieve and Parse Metadata More Efficiently in the News service on RDP API?

Is there a more organized way to retrieve metadata information within the News service?

The current script, rd.content.news.story.Definition("urn:newsml:reuters.com:20240508:nL4N3HB510:3").get_data(), outputs metadata in a format that is difficult to parse.

Tags: #technology, refinitiv-data-platform-libraries, news-filter, metadata api

1 Answer


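@mohamed.noohaboo Thanks for your question. You can try the following, which first pulls headlines for a list of RICs and then requests each story to extract its text and topic codes: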

import refinitiv.data as rd
from refinitiv.data.content import news
from IPython.display import HTML
import pandas as pd
import numpy as np
from datetime import datetime, timedelta
import time

rd.open_session()

# Get a list of headlines for each RIC in the list
dNow = datetime.now().date()
maxenddate = dNow - timedelta(days=360)  # up to 15 months back
compNews = pd.DataFrame()
riclist = ['VOD.L', 'HD', 'MSFT.O']

for ric in riclist:
    try:
        cHeadlines = rd.news.get_headlines("R:" + ric + " AND Language:LEN", start=str(dNow), end=str(maxenddate), count=300)
        cHeadlines['cRIC'] = ric  # tag each headline with the RIC it was requested for
        if len(compNews):
            compNews = pd.concat([compNews, cHeadlines])
        else:
            compNews = cHeadlines
    except Exception as e:
        # report the failure instead of silently skipping the RIC
        print(f"Could not retrieve headlines for {ric}: {e}")

# move the headline index into a column so rows are numbered 0..n-1
compNews = compNews.reset_index()
compNews

[Screenshot 1718379900951.png: the compNews DataFrame output]
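
The loop above calls pd.concat on every pass. As a rough alternative sketch, the per-RIC frames can be collected in a list and concatenated once at the end. The names collect_headlines, fetch and fake_fetch are hypothetical and introduced only for illustration: in practice fetch would wrap rd.news.get_headlines, while the stub here returns a tiny made-up frame (with an index assumed to be named versionCreated) so the snippet runs without a session.

```python
import pandas as pd


def collect_headlines(rics, fetch):
    """Call fetch(ric) for each RIC and return one combined DataFrame.

    `fetch` is a hypothetical callable that returns a headlines DataFrame.
    RICs whose request raises an exception are reported and skipped.
    """
    frames = []
    for ric in rics:
        try:
            df = fetch(ric).copy()
        except Exception as exc:
            print(f"Skipping {ric}: {exc}")
            continue
        df["cRIC"] = ric  # remember which RIC each headline came from
        frames.append(df)
    if not frames:
        return pd.DataFrame()
    # concatenate once, then move the timestamp index into a column
    return pd.concat(frames).reset_index()


def fake_fetch(ric):
    """Stand-in for rd.news.get_headlines; returns made-up rows."""
    if ric == "BAD":
        raise ValueError("no data")
    index = pd.to_datetime(["2024-05-08", "2024-05-07"])
    index.name = "versionCreated"  # assumed index name
    return pd.DataFrame(
        {
            "headline": [f"{ric} story A", f"{ric} story B"],
            "storyId": [f"urn:{ric}:1", f"urn:{ric}:2"],
        },
        index=index,
    )


combined = collect_headlines(["VOD.L", "BAD", "MSFT.O"], fake_fetch)
print(combined)
```

With the real library, something like collect_headlines(riclist, lambda ric: rd.news.get_headlines("R:" + ric + " AND Language:LEN", count=300)) should slot in, though that call is untested here.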


# For each news headline, get the story text and topic codes

baseurl = "/data/news/v1/stories/"
story_texts = []
q_codes = []

for uri in compNews['storyId']:
    request_definition = rd.delivery.endpoint_request.Definition(
        url = baseurl + uri,
        method = rd.delivery.endpoint_request.RequestMethod.GET
    )
    response = request_definition.get_data()
    time.sleep(0.1)  # short pause between requests
    rawr = response.data.raw
    if 'newsItem' in rawr:
        story_texts.append(rawr['newsItem']['contentSet']['inlineData']['$'])
        topics = rawr['newsItem']['contentMeta']['subject']
        q_codes.append([d['_qcode'] for d in topics])
    else:
        # keep the rows aligned when a story comes back without a newsItem
        story_texts.append('')
        q_codes.append([])

# assign whole columns at once; chained assignment such as
# compNews['storyText'][i] = ... may not write back to the DataFrame
compNews['storyText'] = story_texts
compNews['q_codes'] = q_codes

compNews
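
Since the question is about making the metadata easier to parse, one option is to pull the fields out of the raw JSON with a small function and then use pandas explode to get one row per topic code. This is a minimal sketch, assuming the raw response has the same shape used in the loop above (newsItem, then contentSet / inlineData / $ for the text, and contentMeta / subject / _qcode for the topics). The function name parse_story, the sample payload and its EX:1 / EX:2 codes are made up for illustration; real responses carry many more fields.

```python
import pandas as pd


def parse_story(raw):
    """Flatten one raw story payload into story text and topic codes.

    Returns empty values when the payload has no 'newsItem' key.
    """
    item = raw.get("newsItem")
    if item is None:
        return {"storyText": "", "q_codes": []}
    text = item.get("contentSet", {}).get("inlineData", {}).get("$", "")
    subjects = item.get("contentMeta", {}).get("subject", [])
    codes = [s["_qcode"] for s in subjects if "_qcode" in s]
    return {"storyText": text, "q_codes": codes}


# Made-up payload that mimics only the keys used above
sample_raw = {
    "newsItem": {
        "contentSet": {"inlineData": {"$": "Example story body."}},
        "contentMeta": {"subject": [{"_qcode": "EX:1"}, {"_qcode": "EX:2"}]},
    }
}

rows = [
    {"storyId": "story-1", **parse_story(sample_raw)},
    {"storyId": "story-2", **parse_story({"error": "not found"})},
]
stories = pd.DataFrame(rows)

# One row per (story, topic code); stories without codes are dropped
long_codes = (
    stories[["storyId", "q_codes"]]
    .explode("q_codes")
    .dropna(subset=["q_codes"])
    .reset_index(drop=True)
)
print(long_codes)
```

The long table is usually easier to filter or group by topic code than a column of lists.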

I hope this helps.

