Remove fixed effects from summary_col

(Following up from thread on mailing list: https://groups.google.com/forum/?hl=en#!topic/pystatsmodels/BnySoFBCcAE)

I’m estimating some simple OLS models that have dozens or hundreds of fixed-effects terms, but I want to omit these estimates from the summary_col output. Looking under the hood, it appears that the Summary object is just a DataFrame, which means it should be possible to use index slicing to return the appropriate rows, but Summary objects don’t support the basic DataFrame attributes and methods.

More formally:

import pandas as pd
import numpy as np
import string
import statsmodels.formula.api as smf
from statsmodels.iolib.summary2 import summary_col

# A and B are 26-level categorical columns used as fixed effects;
# C and D are continuous regressors; E is an integer outcome in [0, 10].
df = pd.DataFrame({'A' : list(string.ascii_uppercase)*10,
                   'B' : list(string.ascii_lowercase)*10,
                   'C' : np.random.randn(260),
                   'D' : np.random.normal(size=260),
                   # randint excludes the upper bound, so 11 keeps 10 possible
                   'E' : np.random.randint(0, 11, 260)})

# Four nested models; m3 and m4 add the fixed effects.
m1 = smf.ols('E ~ D',data=df).fit()
m2 = smf.ols('E ~ D + C',data=df).fit()
m3 = smf.ols('E ~ D + C + B',data=df).fit()
m4 = smf.ols('E ~ D + C + B + A',data=df).fit()

print(summary_col([m1,m2,m3,m4]))

This returns a Summary object that has 55 rows (52 for the two fixed effects + the intercept + exogenous C and D terms). I would like a summary object that excludes the 52 fixed effects estimates and only includes the estimates for C, D, and the intercept for all four models. What’s the best way to remove fixed effects from the summary_col? Alternatively, how can I create a Summary object that only includes specific regressors and excludes the rest?
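
One possible approach, sketched here under the assumption that the installed statsmodels release is recent enough for summary_col to accept both the `regressor_order` and `drop_omitted` arguments: list the regressors to keep and ask for everything else to be dropped. The variable names `keep`, `table`, and `row_labels` are illustrative only, and the random data is regenerated with a seeded generator so the sketch is self-contained.

```python
import string

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.iolib.summary2 import summary_col

# Rebuild data shaped like the example above (seeded for repeatability).
rng = np.random.default_rng(0)
df = pd.DataFrame({
    'A': list(string.ascii_uppercase) * 10,
    'B': list(string.ascii_lowercase) * 10,
    'C': rng.standard_normal(260),
    'D': rng.standard_normal(260),
    'E': rng.integers(0, 11, 260),
})

m1 = smf.ols('E ~ D', data=df).fit()
m4 = smf.ols('E ~ D + C + B + A', data=df).fit()

# Regressors to report; names must match the fitted parameter names.
keep = ['Intercept', 'D', 'C']

# Assumption: drop_omitted=True removes every regressor not named in
# regressor_order, which here means all of the A[...] and B[...] dummies.
table = summary_col([m1, m4], regressor_order=keep, drop_omitted=True)
print(table)

# The Summary object stores its underlying DataFrame in .tables,
# so the remaining row labels can be inspected directly.
row_labels = [str(label) for label in table.tables[0].index]
```

If the installed version lacks `drop_omitted`, the DataFrame in `table.tables[0]` can still be sliced with ordinary pandas indexing, although the result is then a plain DataFrame rather than a Summary object.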