ESG data extraction

Hi,

I am extracting multiple ESG metrics for various companies from 2006 to 2024. However, I’m facing an issue: the output structure does not include a date for every company when data is missing. I would like to modify the output so that there is an entry for every date in the range for all companies, regardless of whether data is available.

I have attached my code along with a screenshot showing a snippet of the current output. Could you please assist with this?

Thank you in advance.

Emission Fields

emissions_fields = [
    "TR.AnalyticEnvExpenditures.fperiod",
    "TR.AnalyticEnvExpenditures",
    "TR.PolicyEmissions",
    "TR.TargetsEmissions",
    "TR.BiodiversityImpactReduction",
    "TR.NOxSOxEmissionsReduction",
    "TR.VOCEmissionsReduction",
    "TR.PMReduction",
    "TR.WasteReductionInitiatives",
    "TR.eWasteReduction",
    "TR.EmissionsTrading",
    "TR.EnvPartnerships",
    "TR.ISO14000",
    "TR.EMSCertifiedPct",
    "TR.EnvRestorationInitiatives",
    "TR.StaffTransportationReduction",
    "TR.ClimateChangeRisksOpp",
    "TR.EnvInvestments",
    "TR.WasteRecyclingRatio",
    "TR.EmissionReductionTargetPctage",
    "TR.EmissionsTargetAnnualReduction",
    "TR.BiodiversityNetPositiveImpact",
    "TR.BiodiversityTargets",
    "TR.TargetsPollution",
    "TR.TargetsWaste",
    "TR.PolicyPollution",
    "TR.PolicyWaste",
    "TR.AnalyticEnvExpendituresScore",
    "TR.PolicyEmissionsScore",
    "TR.eWasteReductionScore",
    "TR.EmissionsTradingScore",
    "TR.EnvPartnershipsScore",
    "TR.EMSCertifiedPctScore",
    "TR.EnvRestorationInitiativesScore",
    "TR.StaffTransportationReductionScore",
    "TR.ClimateChangeRisksOppScore",
    "TR.BiodiversityCommitment"
]

Extract ESG Data

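import lseg.data as ld  # assumed import for the `ld` alias; a session must already be open

start_date = '2006-12-31'
end_date = '2024-12-31'
# historical_constituents is a DataFrame built earlier (not shown here)
ric = list(historical_constituents['Constituent RIC'].unique())

def get_esg_data(fields, ric):
    # Request annual (FY) values for every RIC over the full date range
    df = ld.get_data(
        universe=ric,
        fields=fields,
        parameters={'Sdate': start_date, 'Edate': end_date, 'Frq': 'FY', 'Period': 'FY0'},
        header_type=ld.HeaderType.TITLE
    )
    return df

emissions_df = get_esg_data(emissions_fields, ric)
emissions_df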

Answers

  • Jirapongse
    Jirapongse ✭✭✭✭✭

    @Pasa

    Thank you for reaching out to us.

    I tested the following code.

    emissions_fields = [
        "TR.AnalyticEnvExpenditures.fperiod",
        "TR.AnalyticEnvExpenditures",
        "TR.PolicyEmissions.fperiod",
        "TR.PolicyEmissions",
        "TR.PolicyEmissionsScore.fperiod",
        "TR.PolicyEmissionsScore"
        ]
    start_date = '2006-12-31'
    end_date = '2024-12-31'
    
    df = ld.get_data(
        universe=["BBBY.OQ^E23","INSW.N"],
        fields=emissions_fields,
        parameters={'Sdate': start_date, 'Edate': end_date, 'Frq': 'FY', 'Period': 'FY0'},
        header_type=ld.HeaderType.TITLE
        )
    
    df
    

    I got this data.

    [Screenshot: image.png]

    First, you can contact the helpdesk team directly via MyAccount to verify the content.

    As far as I know, the API has no option to fill in missing values.
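If the gaps have to be filled on the client side, one option is to reindex the returned DataFrame with pandas so that every company has a row for every period. This is only a minimal sketch, not a feature of the API: the helper `fill_missing_periods`, the column names `Instrument` and `Period`, and the sample values are assumptions for illustration, so the names should be checked against the actual output.

```python
import pandas as pd

def fill_missing_periods(df, instruments, periods,
                         instrument_col="Instrument", period_col="Period"):
    """Return df with one row per (instrument, period) pair; gaps become NaN.

    Hypothetical helper: the column names are assumptions and must match
    the titles actually returned by the data request.
    """
    # Every combination of company and period that should appear in the output
    full_index = pd.MultiIndex.from_product(
        [instruments, periods], names=[instrument_col, period_col]
    )
    return (
        df.dropna(subset=[instrument_col, period_col])            # rows without a period cannot be placed
          .drop_duplicates(subset=[instrument_col, period_col])   # reindex needs unique keys (keeps first)
          .set_index([instrument_col, period_col])
          .reindex(full_index)                                    # adds the missing pairs as NaN rows
          .reset_index()
    )

# Toy stand-in for the API response (values invented for illustration)
sample = pd.DataFrame({
    "Instrument": ["AAA.N", "AAA.N", "BBB.N"],
    "Period": ["FY2006", "FY2008", "FY2007"],
    "Value": [1.0, 3.0, 2.0],
})
periods = [f"FY{year}" for year in range(2006, 2009)]
filled = fill_missing_periods(sample, ["AAA.N", "BBB.N"], periods)
print(filled)
```

With the real data, `instruments` would be the RIC list and `periods` the fiscal-year labels for 2006 to 2024, assuming the period column holds labels in that form. Only one period column is used as the key here, so fields that report on a different period would still need their own check.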

  • Pasa
    Pasa Newcomer

    Hi,

    Thank you for your assistance.

    Just to confirm:

    Is it not possible to extract all dates within the specified date range at the given frequency without retrieving the date ("fperiod") for every single data point? I want to limit the data extraction as much as possible :-)

  • Jirapongse
    Jirapongse ✭✭✭✭✭

    The API just sends a request to the endpoint and retrieves the response.

    You may need to contact the helpdesk team to confirm this.