# Independence tests for Chain Ladder

Some critical assumptions behind chain ladder should be tested before applying the method. Thomas Mack suggested several tests, for example in “Measuring the variability of Chain Ladder reserve estimates” (1997). Below are implementations of two of them: a test for correlation between subsequent development factors and a test for the impact of calendar years. I don’t feel confident enough to modify the package code directly, but hopefully this can serve as a useful template.

```
import numpy as np
import pandas as pd


def developFactorsCorrelation(df):
    # Mack (1997) test for correlation between subsequent development factors.
    # The result should lie between -.67x and +.67x stdError, otherwise there is too much correlation.
    m1 = df.rank()  # rank the development factors by column
    m2 = df.to_numpy(copy=True)  # same factors, but ignoring the anti-diagonal
    np.fill_diagonal(np.fliplr(m2), np.nan)
    # shift the factors one column to the right, then leave out the first column
    m2 = pd.DataFrame(np.roll(m2, 1), columns=m1.columns, index=m1.index).iloc[:, 1:]
    m2 = m2.rank()
    numerator = ((m1 - m2) ** 2).sum(axis=0)
    SpearmanFactor = pd.DataFrame(range(1, len(m1.columns) + 1), index=m1.columns, columns=['colNo'])
    I = SpearmanFactor['colNo'].max() + 1
    SpearmanFactor['divisor'] = (I - SpearmanFactor['colNo']) ** 3 - I + SpearmanFactor['colNo']  # (I-k)^3-I+k
    SpearmanFactor['value'] = 1 - 6 * numerator / SpearmanFactor['divisor']
    # weight each column by (I-k-1); the weight sum excludes columns 1 and I
    weightSum = (SpearmanFactor['colNo'].iloc[1:-1] - 1).sum()
    SpearmanFactor['weighted'] = SpearmanFactor['value'] * (I - SpearmanFactor['colNo'] - 1) / weightSum
    SpearmanCorr = SpearmanFactor['weighted'].iloc[1:-1].sum()  # excluding 1st and last elements as not significant
    SpearmanCorrVar = 2 / ((I - 2) * (I - 3))
    return SpearmanCorr, SpearmanCorrVar
```
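For reference, these are the quantities the function computes, as I read Mack's paper. Here $I$ is the number of origin periods (one more than the number of factor columns), $r_{ik}$ is the rank of factor $F_{ik}$ within column $k$, and $s_{ik}$ is the rank of the preceding factor $F_{i,k-1}$ among the same rows. The notation is my assumption and may differ slightly from the paper's.

```latex
T_k = 1 - \frac{6 \sum_{i=1}^{I-k} (r_{ik} - s_{ik})^2}{(I-k)^3 - I + k},
\qquad
T = \frac{\sum_{k=2}^{I-2} (I-k-1)\, T_k}{\sum_{k=2}^{I-2} (I-k-1)},
\qquad
\operatorname{Var}(T) = \frac{2}{(I-2)(I-3)}
```

The weight sum in the denominator of $T$ equals $(I-2)(I-3)/2$. The factor 0.67 applied to the standard error is roughly the 75th percentile of the standard normal distribution, so the acceptance range is approximately a 50% interval around zero.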

```
from scipy.stats import binom


def calendarCorrelation(df, pCritical=.1):
    # Mack (1997) test for calendar year effect.
    # A calendar period has impact across developments if the probability of the number of small (or large)
    # development factors in that period occurring randomly is less than pCritical.
    # df should have period as the row index, on the assumption that the first anti-diagonal
    # is in relation to the same period (development=0).
    m1 = df.rank()  # rank the development factors by column
    med = m1.median(axis=0)  # find the median rank for each column
    m1large = m1.apply(lambda r: r > med, axis=1)  # True for elements that are large (above the median rank)
    m1small = m1.apply(lambda r: r < med, axis=1)  # True for elements that are small (below the median rank)
    m2large = m1large.to_numpy(copy=True)
    m2small = m1small.to_numpy(copy=True)
    # number of small and of large elements in each anti-diagonal (calendar year), from the top-left corner onwards
    S = [np.diag(m2small[:, ::-1], k).sum() for k in range(min(m2small.shape), -1, -1)]
    L = [np.diag(m2large[:, ::-1], k).sum() for k in range(min(m2large.shape), -1, -1)]
    # binomial probability (p=0.5) of the observed split of small and large elements in each anti-diagonal
    probs = [binom.pmf(S[i], S[i] + L[i], 0.5) + binom.pmf(L[i], S[i] + L[i], 0.5) for i in range(len(S))]
    # probs[0] belongs to an empty diagonal, so it is skipped
    return pd.Series([p < pCritical for p in probs[1:]], index=df.index)
```
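Note that the function above judges each calendar period on its own with a binomial probability. Mack's paper, as I read it, instead takes Z = min(S, L) for each diagonal, sums Z over the diagonals, and compares the total with a normal range. It uses E[Z] = n/2 - C(n-1, m) n / 2^n and Var[Z] = n(n-1)/4 - C(n-1, m) n(n-1) / 2^n + E[Z] - E[Z]^2, where n = S + L and m = floor((n-1)/2). Below is a minimal sketch of those moments, assuming I have the formulas right. The helper names are hypothetical, and the second function checks the closed form by direct enumeration.

```python
from math import comb


def z_moments_closed_form(n):
    """Mean and variance of Z = min(S, L) with S + L = n, using the closed-form expressions."""
    m = (n - 1) // 2
    c = comb(n - 1, m) * n / 2 ** n
    mean = n / 2 - c
    var = n * (n - 1) / 4 - c * (n - 1) + mean - mean ** 2
    return mean, var


def z_moments_enumerated(n):
    """Same moments by enumerating S ~ Binomial(n, 0.5) directly."""
    probs = [comb(n, s) / 2 ** n for s in range(n + 1)]
    z = [min(s, n - s) for s in range(n + 1)]
    mean = sum(p * v for p, v in zip(probs, z))
    var = sum(p * v ** 2 for p, v in zip(probs, z)) - mean ** 2
    return mean, var


# Compare both routes for a few diagonal sizes
for n in range(2, 9):
    print(n, z_moments_closed_form(n), z_moments_enumerated(n))
```

Summing the means and variances over the diagonals would give the range to compare the total Z against. That step is left out here because it depends on how the package wants to report the result.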

Using the example from the paper:

```
from io import StringIO  # needed by pd.read_csv below

MackEx='''1,2,3,4,5,6,7,8,9
1.6,1.32,1.08,1.15,1.2,1.1,1.033,1,1.01
40.4,1.26,1.98,1.29,1.13,.99,1.043,1.03,
2.6,1.54,1.16,1.16,1.19,1.03,1.026,,
2,1.36,1.35,1.1,1.11,1.04,,,
8.8,1.66,1.4,1.17,1.01,,,,
4.3,1.82,1.11,1.23,,,,,
7.2,2.72,1.12,,,,,,
5.1,1.89,,,,,,,
1.7,,,,,,,,
'''
df=pd.read_csv(StringIO(MackEx), header=0)
df.index = df.index+1 #reindex rows from 1
dfCorr,dfCorrVar = developFactorsCorrelation(df)
print('Development factors correlation is {:.2%}'.format(dfCorr))
print('Factor independence if correlation is in range [{:.2%} to {:.2%}]'.format(-.67*np.sqrt(dfCorrVar), .67*np.sqrt(dfCorrVar)))
print('Dependence on calendar year:')
print(calendarCorrelation(df))
```

The result is:

```
Development factors correlation is 6.96%
Factor independence if correlation is in range [-12.66% to 12.66%]
Dependence on calendar year:
1    False
2    False
3    False
4    False
5    False
6    False
7    False
8    False
9    False
dtype: bool
```
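As a sanity check on the 6.96% figure, each per-column coefficient should equal the Pearson correlation of the ranks of two adjacent factor columns, restricted to the rows where both are present. That equivalence holds when there are no ties, which is the case for this triangle. A minimal sketch under that assumption follows; `pairwise_spearman_T` is a hypothetical helper, not part of the package.

```python
from io import StringIO

import numpy as np
import pandas as pd

# Development factors from the example above (rows are origin periods)
MACK_FACTORS = '''1,2,3,4,5,6,7,8,9
1.6,1.32,1.08,1.15,1.2,1.1,1.033,1,1.01
40.4,1.26,1.98,1.29,1.13,.99,1.043,1.03,
2.6,1.54,1.16,1.16,1.19,1.03,1.026,,
2,1.36,1.35,1.1,1.11,1.04,,,
8.8,1.66,1.4,1.17,1.01,,,,
4.3,1.82,1.11,1.23,,,,,
7.2,2.72,1.12,,,,,,
5.1,1.89,,,,,,,
1.7,,,,,,,,
'''


def pairwise_spearman_T(factors):
    """Weighted mean of rank correlations between adjacent factor columns."""
    n_cols = factors.shape[1]
    coeffs, weights = [], []
    # Skip the first column (no predecessor) and the last (a single row cannot be correlated)
    for j in range(1, n_cols - 1):
        pair = factors.iloc[:, [j - 1, j]].dropna()  # rows observed in both columns
        ranks = pair.rank()
        rho = np.corrcoef(ranks.iloc[:, 0].to_numpy(), ranks.iloc[:, 1].to_numpy())[0, 1]
        coeffs.append(rho)
        weights.append(len(pair) - 1)  # the weight I - k - 1
    T = np.average(coeffs, weights=weights)
    I = n_cols + 1
    var_T = 2 / ((I - 2) * (I - 3))
    return T, var_T


factors = pd.read_csv(StringIO(MACK_FACTORS), header=0)
T, var_T = pairwise_spearman_T(factors)
print('T = {:.2%}, range = +/- {:.2%}'.format(T, 0.67 * np.sqrt(var_T)))
```

If the ranking logic in `developFactorsCorrelation` is right, both routes should agree with the figures printed above up to rounding.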

### Issue Analytics

- State:
- Created 3 years ago
- Comments:26 (13 by maintainers)

## Top GitHub Comments

Great, so this package leapfrogs R 😃

In respect of my previous

The answer is probably as simple as

which very quickly gives an array with the same shape as n and z

Thanks for the latest fixes @gig67. I think this one is done now. I’ll push a new release to pypi to make the enhancement available in the official package.