Merging historical and forecast dataframe yields duplicates

I get duplicate cells when I merge two dataframes, one containing historical data and a second containing forecast data, as follows:
import eikon as ek  # assumes the app key has already been set with ek.set_app_key()
# Historical data: the last three reported fiscal years (FY-2 to FY0)
tkrrenew = ['0916.HK', '0958.HK', '0451.HK', '1798.HK']
fdflds = ['CF_NAME', 'TR.EBITDA.rfperiod', 'TR.BasicEpsExclExtraItems.Currency', 'TR.BasicEpsExclExtraItems', 'TR.DpsCommonStock']
param = {'Period': 'FY0', 'SDate': 'FY-2', 'EDate': 'FY0', 'FRQ': 'FY'}
valrenew, err = ek.get_data(tkrrenew, fdflds, param)
# Forecast data: estimates for FY1 to FY3
frcstfld = ['CF_NAME', 'TR.EPSSmartEst.rfperiod', 'TR.EPSSmartEst', 'TR.DPSMean']
paramfcst = {'Period': 'FY1', 'SDate': '0', 'EDate': '2', 'FRQ': 'FY'}
fcstrenew, err = ek.get_data(tkrrenew, frcstfld, paramfcst)
# This merge produces the duplicates described above
testrenew = valrenew.merge(fcstrenew, on=['Instrument', 'Name'], how='inner')
The output, after some other commands, is as follows:
Apart from using .drop_duplicates(subset=['Instrument', 'FY']), which leaves me with:
What can I do to have both the historical and forecast data up to FY3 in one dataframe? If .join or .append can place FY1 to FY3 in the same columns as FY-2 to FY0, that would be ideal.
Please advise.
Best Answer
Is this what you're looking for?
valrenew.merge(fcstrenew, on=['Instrument', 'Financial Period Relative'], how='outer')
This will give you:
You can then sort the resulting dataframe on the Instrument and Financial Period Relative columns to get a chronological view grouped by stock. You can remove the duplicate CF_NAME column (CF_NAME_x or CF_NAME_y), or drop the CF_NAME field from one of the two requests.
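For reference, a minimal sketch of the suggested approach, assuming valrenew and fcstrenew come from the get_data calls above (the column handling after the merge is an assumption, not part of the original answer):
# Outer merge keeps both the historical (FY-2 to FY0) and forecast (FY1 to FY3) rows
testrenew = valrenew.merge(fcstrenew, on=['Instrument', 'Financial Period Relative'], how='outer')
# Group rows by instrument and order them by relative financial period
testrenew = testrenew.sort_values(['Instrument', 'Financial Period Relative'])
# The merge duplicates the company-name column as CF_NAME_x / CF_NAME_y; keep one of them.
# Note: with an outer merge the kept column is NaN on rows that only exist in the other frame,
# so dropping the CF_NAME field from one of the two requests (as suggested above) is cleaner.
testrenew = testrenew.drop(columns=['CF_NAME_y']).rename(columns={'CF_NAME_x': 'CF_NAME'})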
Answers
Thanks, Alex & Wasin.
I've rewritten my code such that it now goes as follows:
import pandas as pd  # dffrank and dffwd are DataFrames built earlier in the script
dffrkFY0 = dffrank[dffrank.FY == 'FY0']   # extract FY0 data
dffrkFY1 = dffrank[dffrank.FY == 'FY-1']  # extract FY-1 data
dffrkFY1.columns = ['Instrument', 'Name', 'FYE', 'RpCrncy', 'EstEPS', 'EstDPS', 'EstBVPS', 'PxCrncy', 'Price',
                    'FXCode', 'FXRate', 'EstPER', 'EstPB', 'EstD_Yld']
# dffwd is an earlier DataFrame
dffmerge = pd.concat([dffwd, dffrkFY0, dffrkFY1], join='inner')
dffmerge = dffmerge.sort_values(['Instrument', 'FYE'])
dffmerge = dffmerge.reset_index(drop=True)
dffmerge[:15]
Thanks for your effort; much appreciated.
Later on in the code, I re-jig the <Name> column so that it doesn't appear in such an odd position.
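For example, one possible way of moving it is sketched below, assuming dffmerge from the snippet above (the target position is just an example):
# Reorder the columns so that 'Name' sits right after 'Instrument'
cols = ['Instrument', 'Name'] + [c for c in dffmerge.columns if c not in ('Instrument', 'Name')]
dffmerge = dffmerge[cols]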