jupyter notebook kernel crashes when doing groupby, having one row with NaN

Code Sample, a copy-pastable example if possible


df.groupby('City').count()['Bib'].sort_values(ascending=False).head(20)

Problem description

The data set has 26630 rows, and one row has a NaN in df["City"]. Running the groupby above crashes the Jupyter notebook kernel.

Expected Output

A list of the top 20 cities is expected here.

df[df['City'].notnull()].groupby('City').count()['Bib'].sort_values(ascending=False).head(20)

produces the expected output.
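
For anyone trying to reproduce this, here is a minimal sketch of the two expressions side by side. It assumes a small synthetic DataFrame: the column names `Bib` and `City` come from the report, but the values are made up and are not the Boston results data. Since `groupby` excludes NaN keys by default, both expressions should normally return the same counts.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the race results. Column names follow the report;
# the values are invented for illustration only.
df = pd.DataFrame({
    "Bib": [1, 2, 3, 4, 5, 6],
    "City": ["Boston", "Boston", "Cambridge", np.nan, "Boston", "Cambridge"],
})

# Expression from the report: group on a column that contains one NaN.
top_all = df.groupby("City").count()["Bib"].sort_values(ascending=False).head(20)

# Workaround from the report: drop rows with a missing City before grouping.
top_filtered = (
    df[df["City"].notnull()]
    .groupby("City")
    .count()["Bib"]
    .sort_values(ascending=False)
    .head(20)
)

print(top_all)
print(top_filtered)
```

If the first expression crashes the kernel in your environment while the second does not, that difference, together with the `pd.show_versions()` output, is probably the most useful thing to attach to the report.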

https://github.com/rojour/boston_results

Output of `pd.show_versions()`

INSTALLED VERSIONS

commit: None
python: 3.5.2.final.0
python-bits: 64
OS: Darwin
OS-release: 16.5.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.19.2
nose: 1.3.7
pip: 9.0.1
setuptools: 27.2.0
Cython: 0.25.2
numpy: 1.11.3
scipy: 0.18.1
statsmodels: 0.6.1
xarray: None
IPython: 5.1.0
sphinx: 1.5.1
patsy: 0.4.1
dateutil: 2.6.0
pytz: 2016.10
blosc: None
bottleneck: 1.2.0
tables: 3.3.0
numexpr: 2.6.1
matplotlib: 2.0.0
openpyxl: 2.4.1
xlrd: 1.0.0
xlwt: 1.2.0
xlsxwriter: 0.9.6
lxml: 3.7.2
bs4: 4.5.3
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 1.1.5
pymysql: None
psycopg2: None
jinja2: 2.9.4
boto: 2.45.0
pandas_datareader: 0.2.1

Issue Analytics

State:
Created 6 years ago
Comments: 10 (7 by maintainers)

Top GitHub Comments

1 reaction

jorisvandenbossche commented, Mar 23, 2017

With the same versions, but on Windows, I cannot reproduce this (EDIT: same on Linux).

0 reactions

mroeschke commented, May 8, 2021

It appears that this issue is not reproducible and our testing suite doesn’t support jupyter kernels. Unless we receive more specific information to reproduce the issue, going to close.
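
As a rough sketch of what "more specific information" could look like, the snippet below summarises the grouping column so it can be pasted into a report. The helper name `describe_group_key` and the sample frame are hypothetical and are not part of pandas or of the original issue.

```python
import numpy as np
import pandas as pd

def describe_group_key(frame, column):
    """Summarise a grouping column: row count, missing values, dtype, value types.

    Hypothetical helper for bug reports; not a pandas API.
    """
    series = frame[column]
    return {
        "rows": len(series),
        "missing": int(series.isnull().sum()),
        "dtype": str(series.dtype),
        # Python types actually stored in the column (e.g. str plus float for NaN).
        "value_types": sorted({type(v).__name__ for v in series}),
    }

# Made-up sample with one missing City, mirroring the situation in the report.
sample = pd.DataFrame({"Bib": [1, 2, 3], "City": ["Boston", np.nan, "Cambridge"]})
info = describe_group_key(sample, "City")
print(info)
```

Running it on the real DataFrame before the groupby would show whether the column holds anything other than strings and a single NaN.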
