Cronbach's alpha implementation (with code)

Cronbach’s alpha is a measure of internal consistency.
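For reference, this is the standard formula that the code in this issue implements. The symbols are chosen here and are not taken from the issue: k is the number of items, \sigma^2_{Y_i} is the sample variance of item i, and \sigma^2_X is the sample variance of the subjects' total scores.

```latex
\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_{i=1}^{k} \sigma^2_{Y_i}}{\sigma^2_X}\right)
```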

http://www.ats.ucla.edu/stat/spss/faq/alpha.html

https://en.wikipedia.org/wiki/Cronbach's_alpha

I am new to Python. I have working code (see below) that can be improved, but I do not know where it belongs in statsmodels.

import numpy as np


def svar(X):
    # Unbiased sample variance (divides by n - 1).
    n = float(len(X))
    mean = np.mean(X)
    return sum((x - mean) ** 2 for x in X) / (n - 1.)


def CronbachAlpha(itemscores):
    # itemscores: one row per item, one column per subject.
    itemvars = [svar(item) for item in itemscores]
    # Total score of each subject, summed over all items.
    tscores = [0] * len(itemscores[0])
    for item in itemscores:
        for i in range(len(item)):
            tscores[i] += item[i]
    nitems = len(itemscores)
    print("total scores=", tscores, "number of items=", nitems)

    Calpha = nitems / (nitems - 1.) * (1 - sum(itemvars) / svar(tscores))

    return Calpha

########### Test ################
itemscores = [[4, 14, 3, 3, 23, 4, 52, 3, 33, 3],
              [5, 14, 4, 3, 24, 5, 55, 4, 15, 3]]
print("Cronbach alpha = ", CronbachAlpha(itemscores))

I would suggest merging the two functions and adapting the code to NumPy arrays before adding it.
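A minimal sketch of what that merge could look like, keeping the layout used in the issue (one row per item, one column per subject). The function name and the input check are assumptions made for illustration and are not part of statsmodels.

```python
import numpy as np


def cronbach_alpha_items_in_rows(itemscores):
    """Cronbach's alpha for scores laid out as items (rows) x subjects (columns).

    Hypothetical helper name; same layout as the code in the issue.
    """
    scores = np.asarray(itemscores, dtype=float)
    if scores.ndim != 2 or scores.shape[0] < 2 or scores.shape[1] < 2:
        raise ValueError("need a 2-D array with at least 2 items and 2 subjects")
    nitems = scores.shape[0]
    item_vars = scores.var(axis=1, ddof=1)        # sample variance of each item
    total_var = scores.sum(axis=0).var(ddof=1)    # variance of per-subject totals
    return nitems / (nitems - 1.0) * (1.0 - item_vars.sum() / total_var)


# Test data from the issue: 2 items, 10 subjects.
itemscores = [[4, 14, 3, 3, 23, 4, 52, 3, 33, 3],
              [5, 14, 4, 3, 24, 5, 55, 4, 15, 3]]
alpha = cronbach_alpha_items_in_rows(itemscores)
print("Cronbach alpha =", round(alpha, 4))
```

Using ddof=1 matches the n - 1 denominator of the original svar helper, so the result should agree with the two-function version.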

Issue Analytics

Created 10 years ago
Comments: 9 (4 by maintainers)

Top GitHub Comments

jmarrec commented, Jul 22, 2015

Has this been implemented yet?

As I wrote on the original Stack Overflow post:

For anyone puzzled as to why it always returns a value close to 1.0: note that here itemscores is n*p, where each of the n rows is an item (a question) and each of the p columns holds one subject’s answers. If you’re using pandas like I am, chances are each row is a respondent and each column is an item. So to use this function, you need to transpose the dataframe or modify the function. Also note that in Python 2.7 you need from __future__ import division, or you need to wrap the denominators in float().

I think the documentation on the pull request was confusing or wrong, as it states that itemscores is a ‘n*p array or dataframe, n subjects and p items’ while it clearly is the transpose of that.

To use it the other way around, with subjects in rows and items in columns (and so that it works with a dataframe), I’d modify the code to:

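import numpy as np


def CronbachAlpha(itemscores):
    # itemscores: n subjects (rows) by p items (columns); array or dataframe
    itemscores = np.asarray(itemscores)
    itemvars = itemscores.var(axis=0, ddof=1)  # sample variance of each item
    tscores = itemscores.sum(axis=1)           # total score of each subject
    nitems = itemscores.shape[1]
    calpha = nitems / float(nitems-1) * (1 - itemvars.sum() / float(tscores.var(ddof=1)))

    return calpha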
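A quick way to see the orientation issue is to run the subjects-in-rows version on the test data from the issue, once transposed and once as given. This is only a sketch: the function is restated under a hypothetical name so the block runs on its own, and the variable names are illustrative.

```python
import numpy as np


def cronbach_alpha_subjects_in_rows(itemscores):
    """Same computation as the modified function above; hypothetical name."""
    scores = np.asarray(itemscores, dtype=float)
    nitems = scores.shape[1]
    item_vars = scores.var(axis=0, ddof=1)        # sample variance of each item
    total_var = scores.sum(axis=1).var(ddof=1)    # variance of per-subject totals
    return nitems / (nitems - 1.0) * (1.0 - item_vars.sum() / total_var)


# Test data from the issue: 2 items (rows) x 10 subjects (columns).
items_by_subjects = np.array([[4, 14, 3, 3, 23, 4, 52, 3, 33, 3],
                              [5, 14, 4, 3, 24, 5, 55, 4, 15, 3]])

# Transpose so each row is a subject, the layout this version expects.
alpha_ok = cronbach_alpha_subjects_in_rows(items_by_subjects.T)

# Passing the untransposed data treats the 10 subjects as if they were 10 items.
alpha_swapped = cronbach_alpha_subjects_in_rows(items_by_subjects)

print("subjects in rows:", round(alpha_ok, 4))
print("layout swapped:  ", round(alpha_swapped, 4))
```

The two calls give clearly different values, so it is worth checking the shape of the input (for a dataframe, rows should be respondents) before trusting the result.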

josef-pkt commented, Nov 6, 2019

It looks like McDonald’s omega requires a factor analysis as a first step: http://www.personality-project.org/r/html/omega.html
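For context, in the simplest single-factor case omega is usually written in terms of the factor loadings \lambda_i and the unique variances \psi_i of the k items, which is why a fitted factor model is needed first. The symbols are assumptions made here, and the linked page also covers hierarchical variants that this form does not.

```latex
\omega = \frac{\left(\sum_{i=1}^{k} \lambda_i\right)^2}{\left(\sum_{i=1}^{k} \lambda_i\right)^2 + \sum_{i=1}^{k} \psi_i}
```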

Factor analysis in statsmodels is still pretty new, so it wasn’t available during the old discussion and work on agreement and reliability measures. Maybe omega becomes feasible now.

(Some of this seems to go in the direction of item response theory and factor analysis for categorical variables. I never got beyond reading a few abstracts on that.)