Problem with example notebook American National Election Studies (ANES) data: ValueError: The first guess on the deviance function returned a nan. This could be a boundary problem and should be reported.

Original GitHub issue: https://github.com/bambinos/bambi/issues/129

The ANES example notebook, ANES_logistic_regression.ipynb, fails with a ValueError when the model is fit. The cells leading up to the failure are:


import bambi as bmb
import pandas as pd
import numpy as np
import pymc3 as pm
import statsmodels.api as sm
import matplotlib.pyplot as plt
%matplotlib inline

data = pd.read_csv('ANES_2016_pilot.csv')
data.head()
data['vote'].value_counts()
data['party_id'].value_counts()

fig, ax = plt.subplots(3, figsize=(10,6))
key = dict(zip(data['party_id'].unique(),range(3)))
for label, df in data.groupby('party_id'):
    ax[key[label]].hist(df['age'])
    ax[key[label]].set_xlim([18,90])
    ax[key[label]].set_xlabel('Age')
    ax[key[label]].set_ylabel('Frequency')
    ax[key[label]].set_title(label)
    ax[key[label]].axvline(df['age'].mean())
plt.tight_layout()

pd.crosstab(data['vote'], data['party_id'])

clinton_data = data.loc[data['vote'].isin(['clinton','trump']),:]
clinton_data.head()

clinton_model = bmb.Model(clinton_data)
clinton_fitted = clinton_model.fit('vote[clinton] ~ party_id + party_id:age', family="bernoulli", samples=1000, chains=4, init=None)

The error occurs when the line above runs. The full traceback is shown below.

/opt/local/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/pandas/core/frame.py:3140: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  self[k1] = value[k2]
/opt/local/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/pandas/core/generic.py:4388: FutureWarning: Attribute 'is_copy' is deprecated and will be removed in a future version.
  object.__getattribute__(self, name)
/opt/local/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/pandas/core/generic.py:4389: FutureWarning: Attribute 'is_copy' is deprecated and will be removed in a future version.
  return object.__setattr__(self, name, value)
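
The SettingWithCopyWarning above is separate from the ValueError. It most likely appears because clinton_data is a slice of data and columns are later assigned on it. A minimal sketch of the usual pandas remedy, taking an explicit copy of the subset, follows. The small DataFrame is a made-up stand-in for ANES_2016_pilot.csv (only the column names come from the issue), and age_decades is a hypothetical column used only to demonstrate an assignment. Nothing in the thread suggests this affects the ValueError.

```python
import pandas as pd

# Made-up stand-in for ANES_2016_pilot.csv; only the column names come from the issue.
data = pd.DataFrame({
    "vote": ["clinton", "trump", "someone_else", "clinton"],
    "party_id": ["democrat", "republican", "independent", "democrat"],
    "age": [34, 58, 41, 27],
})

# .copy() makes the subset an independent DataFrame instead of a slice of `data`.
mask = data["vote"].isin(["clinton", "trump"])
clinton_data = data.loc[mask, :].copy()

# Hypothetical later assignment: it touches only the copy, never `data`.
clinton_data["age_decades"] = clinton_data["age"] / 10

print(clinton_data)
print("age_decades" in data.columns)
```

The copy costs some memory, but it makes ownership of the subset explicit, so code that later adds or rewrites columns has no ambiguity about which frame it is modifying.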


---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-8-005c4346e6ee> in <module>()
      1 clinton_model = bmb.Model(clinton_data)
      2 clinton_fitted = clinton_model.fit('vote[clinton] ~ party_id + party_id:age',
----> 3     family='bernoulli', samples=1000, chains=4, init=None)

/opt/local/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/bambi/models.py in fit(self, fixed, random, priors, family, link, run, categorical, backend, **kwargs)
    278         if run:
    279             if not self.built or backend != self._backend_name:
--> 280                 self.build(backend)
    281             return self.backend.run(**kwargs)
    282 

/opt/local/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/bambi/models.py in build(self, backend)
    218                 taylor = 5 if self.family.name == 'gaussian' else 1
    219             scaler = PriorScaler(self, taylor=taylor)
--> 220             scaler.scale()
    221 
    222         # For bernoulli models with n_trials = 1 (most common use case),

/opt/local/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/bambi/priors.py in scale(self)
    409 
    410             # scale it!
--> 411             getattr(self, '_scale_%s' % term_type)(t)

/opt/local/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/bambi/priors.py in _scale_fixed(self, term)
    308             mu += [0]
    309             sd += [self._get_slope_stats(exog=self.dm, predictor=pred,
--> 310                                          sd_corr=sd_corr)]
    311 
    312         # save and set prior

/opt/local/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/bambi/priors.py in _get_slope_stats(self, exog, predictor, sd_corr, full_mod, points)
    228                                 str(exog.columns[i])+'='+str(val),
    229                                 start_params=full_mod.params.values)
--> 230                     for val in values[:-1]]
    231             null = np.append(null, full_mod)
    232             ll = np.array([x.llf for x in null])

/opt/local/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/bambi/priors.py in <listcomp>(.0)
    228                                 str(exog.columns[i])+'='+str(val),
    229                                 start_params=full_mod.params.values)
--> 230                     for val in values[:-1]]
    231             null = np.append(null, full_mod)
    232             ll = np.array([x.llf for x in null])

/opt/local/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/statsmodels/genmod/generalized_linear_model.py in fit_constrained(self, constraints, start_params, **fit_kwds)
   1284         params, cov, res_constr = fit_constrained(self, R, q,
   1285                                                   start_params=start_params,
-> 1286                                                   fit_kwds=fit_kwds)
   1287         # create dummy results Instance, TODO: wire up properly
   1288         res = self.fit(start_params=params, maxiter=0)  # we get a wrapper back

/opt/local/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/statsmodels/base/_constraints.py in fit_constrained(model, constraint_matrix, constraint_values, start_params, fit_kwds)
    258     # using offset as keywords is not supported in all modules
    259     mod_constr = self.__class__(endog, exogp_st, offset=offset, **init_kwds)
--> 260     res_constr = mod_constr.fit(start_params=start_params, **fit_kwds)
    261     params_orig = transf.expand(res_constr.params).squeeze()
    262     cov_params = transf.transf_mat.dot(res_constr.cov_params()).dot(transf.transf_mat.T)

/opt/local/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/statsmodels/genmod/generalized_linear_model.py in fit(self, start_params, maxiter, method, tol, scale, cov_type, cov_kwds, use_t, full_output, disp, max_start_irls, **kwargs)
   1010             return self._fit_irls(start_params=start_params, maxiter=maxiter,
   1011                                   tol=tol, scale=scale, cov_type=cov_type,
-> 1012                                   cov_kwds=cov_kwds, use_t=use_t, **kwargs)
   1013         else:
   1014             self._optim_hessian = kwargs.get('optim_hessian')

/opt/local/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/statsmodels/genmod/generalized_linear_model.py in _fit_irls(self, start_params, maxiter, tol, scale, cov_type, cov_kwds, use_t, **kwargs)
   1107                                    self.freq_weights, self.scale)
   1108         if np.isnan(dev):
-> 1109             raise ValueError("The first guess on the deviance function "
   1110                              "returned a nan.  This could be a boundary "
   1111                              " problem and should be reported.")

ValueError: The first guess on the deviance function returned a nan.  This could be a boundary  problem and should be reported.
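
The thread does not establish what made the deviance nan. One cheap thing to rule out before suspecting Bambi or statsmodels is missing or infinite values in the columns the formula uses (vote, party_id, age). The sketch below is an assumption about a reasonable first check, not the fix reported in this issue. check_model_columns is a hypothetical helper and the data is made up.

```python
import numpy as np
import pandas as pd


def check_model_columns(df, columns):
    """Count missing, infinite, and distinct values for each model column."""
    rows = {}
    for col in columns:
        series = df[col]
        n_missing = int(series.isna().sum())
        if pd.api.types.is_numeric_dtype(series):
            # np.isinf is False for NaN, so the two counts do not overlap.
            n_inf = int(np.isinf(series.astype(float)).sum())
        else:
            n_inf = 0
        rows[col] = {
            "missing": n_missing,
            "infinite": n_inf,
            "unique": int(series.nunique()),
        }
    return pd.DataFrame.from_dict(rows, orient="index")


# Made-up rows; only the column names come from the issue.
demo = pd.DataFrame({
    "vote": ["clinton", "trump", "clinton", "trump"],
    "party_id": ["democrat", "republican", "independent", "democrat"],
    "age": [34.0, np.nan, 41.0, np.inf],
})

report = check_model_columns(demo, ["vote", "party_id", "age"])
print(report)
```

A clean report does not prove the data is fine for the model, but a nonzero count in any of these columns is worth resolving before re-running the fit.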

Issue Analytics

Created 5 years ago
Comments: 10 (3 by maintainers)

Top GitHub Comments

MooersLab commented, Mar 31, 2019

Hi Osvaldo,

Thank you very much for updating bambi and responding to this four-month-old open issue. I too found the notebook in question to be working today with Python 3.5 and Python 3.7 from MacPorts.

I use Anaconda (it is getting better all the time), but I still prefer MacPorts (faster installs, more packages, fewer dependency conflicts). The following may interest only other MacPorts users. I had updated pymc3 through MacPorts, which also updated joblib, a module pymc3 depends on, to a version newer than what Bambi accepts. I had to uninstall joblib and pymc3 from my MacPorts distribution. I then installed Bambi with pip from the GitHub repository, followed by pymc3, also with pip from the GitHub repository. After that, the notebook worked with both Python 3.5 and Python 3.7.

On Sun, Mar 31, 2019 at 8:11 AM, Osvaldo Martin wrote:

> I am not able to reproduce this. Is this still a problem?

Best regards,

Blaine
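
Since the comment above traces the problem to package versions (a joblib newer than Bambi accepted), it can help to record which versions are actually installed before and after reinstalling. The following is a minimal sketch using only the standard library. It assumes Python 3.8 or later for importlib.metadata, which is newer than the 3.5 and 3.7 interpreters mentioned in the thread, and installed_versions is a hypothetical helper name.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Packages named in the issue and its traceback.
report = installed_versions(["bambi", "pymc3", "joblib", "statsmodels", "pandas"])
for name, version in report.items():
    print(f"{name}: {version or 'not installed'}")
```

Running this in the same interpreter that runs the notebook matters here, because a MacPorts Python and a pip-managed environment can hold different versions of the same package.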

MooersLab commented, Mar 31, 2019

Time to close this issue.
