jupyter notebook kernel crashes when doing groupby, having one row with NaN

Code Sample, a copy-pastable example if possible


df.groupby('City').count()['Bib'].sort_values(ascending=False).head(20)

Problem description

The data set has 26630 rows, and exactly one row has NaN in df["City"]. Running the groupby above crashes the Jupyter notebook kernel.

Expected Output

A list of the top 20 cities is expected here.

df[df['City'].notnull()].groupby('City').count()['Bib'].sort_values(ascending=False).head(20)

produces the expected output.
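
The workaround above can be sketched on a small synthetic frame. This is only an illustration under stated assumptions: the rows below are invented stand-ins for the race results (the real data is in the linked repository), and the names `with_city` and `top_cities` are hypothetical. It does not attempt to reproduce the crash itself.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in data; only the column names 'Bib' and 'City' come from the report.
df = pd.DataFrame({
    "Bib": [1, 2, 3, 4, 5, 6, 7],
    "City": ["Boston", "Boston", "Boston", "Cambridge", "Cambridge", "Newton", np.nan],
})

# Workaround from the report: drop rows whose City is missing before grouping.
with_city = df[df["City"].notnull()]

# Selecting 'Bib' before counting avoids counting every other column as well.
top_cities = (
    with_city.groupby("City")["Bib"]
    .count()
    .sort_values(ascending=False)
    .head(20)
)
print(top_cities)
```

Selecting the column before calling count() gives the same per-city numbers as count()['Bib'] while doing less work on wide frames.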

https://github.com/rojour/boston_results

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.5.2.final.0
python-bits: 64
OS: Darwin
OS-release: 16.5.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.19.2
nose: 1.3.7
pip: 9.0.1
setuptools: 27.2.0
Cython: 0.25.2
numpy: 1.11.3
scipy: 0.18.1
statsmodels: 0.6.1
xarray: None
IPython: 5.1.0
sphinx: 1.5.1
patsy: 0.4.1
dateutil: 2.6.0
pytz: 2016.10
blosc: None
bottleneck: 1.2.0
tables: 3.3.0
numexpr: 2.6.1
matplotlib: 2.0.0
openpyxl: 2.4.1
xlrd: 1.0.0
xlwt: 1.2.0
xlsxwriter: 0.9.6
lxml: 3.7.2
bs4: 4.5.3
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 1.1.5
pymysql: None
psycopg2: None
jinja2: 2.9.4
boto: 2.45.0
pandas_datareader: 0.2.1

Issue Analytics

  • State: closed
  • Created 6 years ago
  • Comments: 10 (7 by maintainers)

Top GitHub Comments

1 reaction
jorisvandenbossche commented, Mar 23, 2017

With the same versions, but on Windows, I cannot reproduce this (EDIT: same on Linux).

0 reactions
mroeschke commented, May 8, 2021

It appears that this issue is not reproducible and our testing suite doesn’t support jupyter kernels. Unless we receive more specific information to reproduce the issue, going to close.
