BUG: Regression in getitem for SparseArray with greater comparison

Pandas version checks

I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of pandas.
I have confirmed this bug exists on the master branch of pandas.

Reproducible Example

#44955 broke some things with greater-than comparisons. We should fix them before 1.4 is released, or revert #44955.


import numpy as np
import pandas as pd
s = pd.arrays.SparseArray([1, 2, 3, 4, np.nan, np.nan], fill_value=np.nan)
s[s > 2]

Issue Description

This returns every non-NaN value instead of only the values greater than 2:

[1.0, 2.0, 3.0, 4.0]
Fill: nan
IntIndex
Indices: array([0, 1, 2, 3], dtype=int32)

Expected Behavior

Before the regression, this correctly returned

[3.0, 4.0]
Fill: nan
IntIndex
Indices: array([0, 1], dtype=int32)
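The expected result can be cross-checked against a plain NumPy array. The following is a minimal sketch, not the fix that was merged: it assumes only public NumPy and pandas calls, and the variable names (`values`, `dense`, `mask`) are chosen here for illustration. Converting the sparse mask to a dense boolean ndarray before indexing is an assumption of this sketch for avoiding the sparse-mask code path. It is not a workaround proposed in the issue.

```python
import numpy as np
import pandas as pd

values = [1, 2, 3, 4, np.nan, np.nan]
s = pd.arrays.SparseArray(values, fill_value=np.nan)

# Dense reference: NaN > 2 evaluates to False, so only 3.0 and 4.0 pass the mask.
dense = np.asarray(values, dtype="float64")
expected = dense[dense > 2]

# Assumption of this sketch: materialize the sparse boolean mask as a dense
# ndarray so that indexing does not depend on the sparse-mask handling.
mask = np.asarray(s > 2, dtype=bool)
result = np.asarray(s[mask])

print(result)
print(np.array_equal(result, expected))
```

On an affected build, comparing `np.asarray(s[s > 2])` with `expected` in the same way would be one way to confirm the regression.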

Installed Versions

INSTALLED VERSIONS

commit : a51cd1053e33a2d0dcf5e94efe7cd8ea2d15a1ee
python : 3.8.12.final.0
python-bits : 64
OS : Linux
OS-release : 5.11.0-43-generic
Version : #47~20.04.2-Ubuntu SMP Mon Dec 13 11:06:56 UTC 2021
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8

pandas : 1.4.0.dev0+1516.ga51cd1053e
numpy : 1.21.5
pytz : 2021.1
dateutil : 2.8.2
pip : 21.3.1
setuptools : 59.8.0
Cython : 0.29.24
pytest : 6.2.5
hypothesis : 6.23.1
sphinx : 4.2.0
blosc : None
feather : None
xlsxwriter : 3.0.1
lxml.etree : 4.6.3
html5lib : 1.1
pymysql : None
psycopg2 : None
jinja2 : 3.0.1
IPython : 7.28.0
pandas_datareader: None
bs4 : 4.10.0
bottleneck : 1.3.2
fsspec : 2021.11.0
fastparquet : 0.7.1
gcsfs : 2021.05.0
matplotlib : 3.4.3
numexpr : 2.7.3
odfpy : None
openpyxl : 3.0.9
pandas_gbq : None
pyarrow : 1.0.1
pyxlsb : None
s3fs : 2021.11.0
scipy : 1.7.2
sqlalchemy : 1.4.25
tables : 3.6.1
tabulate : 0.8.9
xarray : 0.18.0
xlrd : 2.0.1
xlwt : 1.3.0
numba : 0.53.1
zstandard : None
None

Process finished with exit code 0

Issue Analytics

State:
Created 2 years ago
Comments: 9 (9 by maintainers)

Top GitHub Comments

phofl commented, Dec 29, 2021

See #41957

bdrum commented, Jan 9, 2022

@jreback, @phofl this also could be closed. Fixed and tested by #45125
