Get Investorpermid as string rather than float

Capture.JPG

I am using the below formula to gather a series of investor names and corresponding IDs for a series of entities:

 df, e = ek.get_data(
['4295894740', '5000045221', '4295867361', '5046707555', '4298459348'],
['TR.InvestorFullName','TR.InvestorFullName.investorpermid'],
{'Sdate':'0D'}
)

This works well in getting the data but unfortunately it returns Investor PermIDs as float types, combined with Nulls.

I tried to solve for the issue in two ways:

1. Converting the column from float to Integer (which is then much easier to transform into String) - this results in errors work due to the presence of Nulls

2. Converting the column to String - this way no errors are returned, and Nulls become strings which can then be manipulated. The problem here though is that those Investor PermIDs become string with a trailing '.0' at the end of it as you can in the attached.


Would you please suggest the most effective turnaround? Ideally it would be great to receive the datapoint in String format as it is in fact to be considered text rather than number. Alternatively, is there a way to exclude Nulls so that the conversion to Integer can work?


Thanks!

Best Answer

  • Jirapongse
    Jirapongse ✭✭✭✭✭
    Answer ✓

    @Giorgio Cozzolino

    I checked and found that the retrieved raw data is a string.

    {'columnHeadersCount': 1,
     'data': [['4295894740', 'The Vanguard Group, Inc.', '4297651992'],
      ['4295894740', 'Norges Bank Investment Management (NBIM)', '5000667881'],
      ['4295894740', 'BlackRock Institutional Trust Company, N.A.', '4296477482'],
      ['4295894740', 'Capital Research Global Investors', '4297954501'],
      ['4295894740', 'BlackRock Advisors (UK) Limited', '5000650246'],
      ['4295894740', 'Capital World Investors', '4297954495'],
    ...

    However, when converting it to the data frame, the Instrument and Investor Perm Id columns are converted to Int64 by the get_data method.

    You can get the raw data by setting the raw_output parameter to True.

    df = ek.get_data(
    ['4295894740', '5000045221', '4295867361', '5046707555', '4298459348'], 
    ['TR.InvestorFullName','TR.InvestorFullName.investorpermid'], 
    {'Sdate':'0D'},raw_output=True
    )

    Then, converting the raw data to the data frame according to your requirements.

Answers

  • Hi @Giorgio Cozzolino

    The support ticket number 10447871 was created on your behalf about the type of investor perm ID data returned.

    Meanwhile, I'd suggest the workaround below

    import pandas as pd
    # convert column to int (to get rid of .0)
    df['Investor Perm Id'] = pd.to_numeric(df['Investor Perm Id'], errors='coerce').fillna(0).astype('int64')
    # then convert it to string
    df['Investor Perm Id'] = df['Investor Perm Id'].astype(str)
    df

    an output output-1.png