Independence tests for Chain Ladder

It is important that some critical assumptions behind the chain ladder method are tested before applying it. Thomas Mack suggested such tests, for example in "Measuring the Variability of Chain Ladder Reserve Estimates" (1997). Below are implementations of his tests for correlation between subsequent development factors and for the impact of calendar years. I don't feel confident enough to modify the package code directly, but hopefully this can serve as a useful template.
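For reference, the statistic computed below can be written out like this (my transcription of Mack's formulas as implemented here, so treat the indexing as an assumption; `n_k = I - k` is the number of rank pairs available at development `k`):

```latex
T_k = 1 - \frac{6 \sum_i (r_{ik} - s_{ik})^2}{n_k^3 - n_k}, \qquad
T = \sum_k \frac{(I-k-1)\, T_k}{\sum_k (I-k-1)}, \qquad
\operatorname{Var}(T) = \frac{2}{(I-2)(I-3)}
```

Here `r_{ik}` and `s_{ik}` are the ranks of the factors in column `k` and of the matching factors in column `k-1`, and the test checks whether `T` falls within about 0.67 standard errors of zero (a 50% confidence interval).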
```python
import numpy as np
import pandas as pd

def developFactorsCorrelation(df):
    # Mack (1997) test for correlation between subsequent development factors.
    # The combined statistic should lie within roughly -0.67 and +0.67 standard
    # errors of zero; otherwise there is too much correlation.
    m1 = df.rank()  # rank the development factors within each column
    m2 = df.to_numpy(copy=True)  # same factors, but ignoring the anti-diagonal
    np.fill_diagonal(np.fliplr(m2), np.nan)  # blank each row's last factor (it has no successor)
    # shift right by one so that column k holds the previous column's factors, then drop column 0
    m2 = pd.DataFrame(np.roll(m2, 1), columns=m1.columns, index=m1.index).iloc[:, 1:]
    m2 = m2.rank()
    numerator = ((m1 - m2) ** 2).sum(axis=0)
    SpearmanFactor = pd.DataFrame(range(1, len(m1.columns) + 1), index=m1.columns, columns=['colNo'])
    I = SpearmanFactor['colNo'].max() + 1
    SpearmanFactor['divisor'] = (I - SpearmanFactor['colNo']) ** 3 - I + SpearmanFactor['colNo']  # (I-k)^3 - I + k
    SpearmanFactor['value'] = 1 - 6 * numerator.T / SpearmanFactor['divisor']
    # weight each column's statistic by I-k-1; the weight sum excludes the first and last columns
    SpearmanFactor['weighted'] = SpearmanFactor['value'] * (I - SpearmanFactor['colNo'] - 1) / (SpearmanFactor[1:-1]['colNo'] - 1).sum()
    SpearmanCorr = SpearmanFactor['weighted'].iloc[1:-1].sum()  # excluding first and last elements as not significant
    SpearmanCorrVar = 2 / ((I - 2) * (I - 3))
    return SpearmanCorr, SpearmanCorrVar
```
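As a sanity check on the per-column statistic (a minimal sketch with made-up factor columns, not from the issue): each column's value above is just the Spearman rank correlation between adjacent development-factor columns, which can be verified against scipy:

```python
import pandas as pd
from scipy.stats import spearmanr

# made-up adjacent development-factor columns, n = 5 rank pairs, no ties
f_k  = pd.Series([1.6, 1.4, 1.5, 1.7, 1.3])
f_k1 = pd.Series([1.2, 1.25, 1.1, 1.15, 1.05])

r, s = f_k.rank(), f_k1.rank()
n = len(f_k)
t_k = 1 - 6 * ((r - s) ** 2).sum() / (n ** 3 - n)  # Spearman via squared rank differences

ref, _ = spearmanr(f_k, f_k1)  # scipy's direct computation
print(t_k, ref)  # both ≈ 0.3
```

The rank-difference formula and `spearmanr` agree exactly whenever there are no tied factors in a column.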
```python
from scipy.stats import binom

def calendarCorrelation(df, pCritical=.1):
    # Mack (1997) test for a calendar year effect.
    # A calendar period has an impact across developments if the probability of the observed
    # number of small (or large) development factors in that period arising randomly is less
    # than pCritical.
    # df should have the period as its row index, on the assumption that the first
    # anti-diagonal relates to the same period (development = 0).
    m1 = df.rank()  # rank the development factors within each column
    med = m1.median(axis=0)  # median rank of each column
    m1large = m1.apply(lambda r: r > med, axis=1)  # True where the rank is above the column median
    m1small = m1.apply(lambda r: r < med, axis=1)  # True where the rank is below the column median
    m2large = m1large.to_numpy(copy=True)
    m2small = m1small.to_numpy(copy=True)
    # count the small and the large elements on each anti-diagonal (calendar year)
    S = [np.diag(m2small[:, ::-1], k).sum() for k in range(min(m2small.shape), -1, -1)]
    L = [np.diag(m2large[:, ::-1], k).sum() for k in range(min(m2large.shape), -1, -1)]
    # two-sided point probability of the observed small/large split under a fair binomial
    probs = [binom.pmf(S[i], S[i] + L[i], 0.5) + binom.pmf(L[i], S[i] + L[i], 0.5) for i in range(len(S))]
    return pd.Series([p < pCritical for p in probs[1:]], index=df.index)
```
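To make the per-diagonal probability concrete (made-up counts, not from the paper): with S = 1 small and L = 6 large factors on one anti-diagonal, the two-sided point probability of that split under a fair coin is:

```python
from scipy.stats import binom

S, L = 1, 6  # hypothetical counts of small and large factors on one anti-diagonal
n = S + L
p = binom.pmf(S, n, 0.5) + binom.pmf(L, n, 0.5)
print(p)  # 2 * C(7,1) / 2**7 = 14/128 = 0.109375
```

That is just above the default `pCritical` of 0.1, so this diagonal would not be flagged; an even more lopsided split such as 0 versus 7 gives 2/128 ≈ 0.016 and would be.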
Using the example triangle of development factors from the paper:

```python
from io import StringIO

MackEx = '''1,2,3,4,5,6,7,8,9
1.6,1.32,1.08,1.15,1.2,1.1,1.033,1,1.01
40.4,1.26,1.98,1.29,1.13,.99,1.043,1.03,
2.6,1.54,1.16,1.16,1.19,1.03,1.026,,
2,1.36,1.35,1.1,1.11,1.04,,,
8.8,1.66,1.4,1.17,1.01,,,,
4.3,1.82,1.11,1.23,,,,,
7.2,2.72,1.12,,,,,,
5.1,1.89,,,,,,,
1.7,,,,,,,,
'''
df = pd.read_csv(StringIO(MackEx), header=0)
df.index = df.index + 1  # reindex rows from 1
dfCorr, dfCorrVar = developFactorsCorrelation(df)
print('Development factors correlation is {:.2%}'.format(dfCorr))
print('Factor independence if correlation is in range [{:.2%} to {:.2%}]'.format(-.67*np.sqrt(dfCorrVar), .67*np.sqrt(dfCorrVar)))
print('Dependence on calendar year:')
print(calendarCorrelation(df))
```
The result is:

```
Development factors correlation is 6.96%
Factor independence if correlation is in range [-12.66% to 12.66%]
Dependence on calendar year:
1    False
2    False
3    False
4    False
5    False
6    False
7    False
8    False
9    False
dtype: bool
```
Issue Analytics
- Created 3 years ago
- Comments: 26 (13 by maintainers)
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
Great, so this package leapfrogs R 😃
In respect of my previous
The answer is probably as simple as
which gives out very quickly an array with same shape as n and z
Thanks for the latest fixes @gig67. I think this one is done now. I'll push a new release to PyPI to make the enhancement available in the official package.