question

Upvotes
Accepted
19 1 1 4

Refinitiv.data proper fields names

Greetings,

I am currently involved in a project aimed at retrieving the closing price for companies that have been delisted due to bankruptcy, specifically the day following their bankruptcy announcement. The primary methodology I employ involves checking whether 'TR.MnAHasBankruptcy' is true. If this condition is met, we proceed to use 'TR.MnAHasBankruptcy.date' to extract the closing price on the subsequent day, utilizing "TR.PriceCloseDate".

I have encountered a challenge in executing the code intended for this purpose. For reference, the dataframe columns available for extraction are as follows: Index(['Instrument', 'Delisted Quote Flag', 'RetireDate', 'Bankruptcy Flag', 'Date', 'Acquisition Techniques', 'Date Effective or Unconditional', 'Price Close', 'Date'], dtype='object'). It is noteworthy that both 'TR.MnAHasBankruptcy.date' and "TR.PriceCloseDate" correspond to the same field name "Date".

I am seeking a solution to address this ambiguity between field names. What strategies could potentially resolve this issue?

Thank you!

import refinitiv.data as rd
import pandas as pd

rd.open_session()
results = rd.discovery.search(
    view=rd.discovery.Views.EQUITY_QUOTES,
    top=500,
    filter="(IssuerOAPermID eq '4295899979' and RCSAssetCategoryLeaf eq 'Ordinary Share')",
    select="RIC, DTSubjectName, RCSAssetCategoryLeaf, ExchangeName, ExchangeCountry"
)

df = results

substrings = ['.OQ', '.O', '.N', '.A']

def is_valid_ric(ric):
    for ending in substrings:
        if ending in ric:
            pos = ric.find(ending)
            if len(ric) == pos + len(ending) or (len(ric) > pos + len(ending) and ric[pos + len(ending)] == '^'):
                return True
    return False

filtered_df = df[df['RIC'].apply(is_valid_ric)]

rics = filtered_df['RIC'].tolist()

df = rd.get_data(
    universe=rics,
    fields=[
        'TR.IsDelistedQuote',
        'DELIST_DAT',
        'TR.RetireDate',
        'TR.MnAHasBankruptcy',
        'TR.MnAHasBankruptcy.date',
        "TR.MnAAcquisitionTechnique(Concat='|')",
        'TR.MnADateUnconditional',
        "TR.PriceClose",
        "TR.PriceCloseDate",
    ],
    parameters={}
)

# Filtering DataFrame where 'TR.MnAHasBankruptcy' is True
bankruptcy_df = df[df['Bankruptcy Flag'] == True].copy()

# Adding a new column for the target date, which is one day after 'TR.MnAHasBankruptcy.date'
bankruptcy_df['TargetDate'] = pd.to_datetime(bankruptcy_df['TR.MnAHasBankruptcy.date']) + pd.Timedelta(days=1)

# Prepare a DataFrame to hold the final results
final_columns = ['RIC', 'TR.MnAHasBankruptcy.date', 'TargetDate',  'Price Close']
final_df = pd.DataFrame(columns=final_columns)

# Iterate through the bankruptcy DataFrame
for index, row in bankruptcy_df.iterrows():
    target_date = row['TargetDate']
    ric = row['RIC']
    # Attempt to find the 'TR.PriceClose' where 'TR.PriceCloseDate' matches 'TargetDate'
    price_close_row = df[(df['RIC'] == ric) & (pd.to_datetime(df['TR.PriceCloseDate']) == target_date)]
    if not price_close_row.empty:
        price_close = price_close_row.iloc[0]['Price Close']
        final_df = final_df.append({
            'RIC': ric,
            'TR.MnAHasBankruptcy.date': row['TR.MnAHasBankruptcy.date'],
            'TargetDate': target_date,
            'Price Close': price_close
        }, ignore_index=True)
    else:
        # Handle the case where no matching 'TR.PriceClose' is found
        final_df = final_df.append({
            'RIC': ric,
            'TR.MnAHasBankruptcy.date': row['TR.MnAHasBankruptcy.date'],
            'TargetDate': target_date,
            'Price Close': "<NA>"
        }, ignore_index=True)

print(final_df)
final_df.to_csv("./bankruptcy_price_close.csv")

rd.close_session()
pythonrefinitiv-dataplatform-eikonworkspace#technologyrfa-apipython apicodebookpandascompany-research
icon clock
10 |1500

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

1 Answer

· Write an Answer
Upvotes
Accepted
5.9k 21 3 6

Hi @vitali ,


If I understand your request properly, you would simply just like to diferenciate between the two 'Date' columns? Do let me know if I missunderstood.

There are 2 ways that come to mind. The 1st is to use the `use_field_names_in_headers` argument:


df = rd.get_data(
    universe=rics,
    fields=[
        'TR.IsDelistedQuote',
        'DELIST_DAT',
        'TR.RetireDate',
        'TR.MnAHasBankruptcy',
        'TR.MnAHasBankruptcy.date',
        "TR.MnAAcquisitionTechnique(Concat='|')",
        'TR.MnADateUnconditional',
        "TR.PriceClose",
        "TR.PriceCloseDate",
    ],
    parameters={},
    use_field_names_in_headers=True)


1712230075263.png


The 2nd is to rename the columns with


for i in range(len(my_list)):
    if my_list[i] == 'Date':
        my_list[i] = replacement_values[count]
        count += 1
        if count == len(replacement_values):
            break


It's rather ugly but it does the trick:


import refinitiv.data as rd
import pandas as pd
 
rd.open_session()
results = rd.discovery.search(
    view=rd.discovery.Views.EQUITY_QUOTES,
    top=500,
    filter="(IssuerOAPermID eq '4295899979' and RCSAssetCategoryLeaf eq 'Ordinary Share')",
    select="RIC, DTSubjectName, RCSAssetCategoryLeaf, ExchangeName, ExchangeCountry"
)
 
df = results
 
substrings = ['.OQ', '.O', '.N', '.A']
 
def is_valid_ric(ric):
    for ending in substrings:
        if ending in ric:
            pos = ric.find(ending)
            if len(ric) == pos + len(ending) or (len(ric) > pos + len(ending) and ric[pos + len(ending)] == '^'):
                return True
    return False
 
filtered_df = df[df['RIC'].apply(is_valid_ric)]
 
rics = filtered_df['RIC'].tolist()
 
df = rd.get_data(
    universe=rics,
    fields=[
        'TR.IsDelistedQuote',
        'DELIST_DAT',
        'TR.RetireDate',
        'TR.MnAHasBankruptcy',
        'TR.MnAHasBankruptcy.date',
        "TR.MnAAcquisitionTechnique(Concat='|')",
        'TR.MnADateUnconditional',
        "TR.PriceClose",
        "TR.PriceCloseDate",
    ],
    parameters={},)

my_list = df.columns.to_list()

replacement_values = ["MnAHasBankruptcyDate", "PriceCloseDate"]
count = 0

for i in range(len(my_list)):
    if my_list[i] == 'Date':
        my_list[i] = replacement_values[count]
        count += 1
        if count == len(replacement_values):
            break

df.columns = my_list

df


1712230238648.png


1712230075263.png (87.3 KiB)
1712230238648.png (75.1 KiB)
icon clock
10 |1500

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Greetings, @jonathan.legrand!
Thank you for your prompt response.

Upon reviewing your feedback, I've realized that it doesn't fully address my requirements. To provide some context, my code utilizes specific field names like 'Bankruptcy Flag' and 'Price Close', which are accurately retrieved using the "df.columns" method. However, I've retained the original names 'TR.MnAHasBankruptcy.date' and 'TR.PriceCloseDate' unchanged because both include the term 'Date', leading to potential conflicts. Due to this similarity, crafting an algorithm that distinguishes between these similarly named fields poses a challenge, as it results in errors.

So, if I leave 'TR.MnAHasBankruptcy.date' and 'TR.PriceCloseDate' unchanged I receive an error: screenshot-2024-04-04-at-133755.png


If I replace 'TR.MnAHasBankruptcy.date' and 'TR.PriceCloseDate' with their proper name 'Date', the error is: screenshot-2024-04-04-at-133953.png

Write an Answer

Hint: Notify or tag a user in this post by typing @username.

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.