BUG: MultiIndex.nunique raises NotImplementedError
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandas.
- (optional) I have confirmed this bug exists on the master branch of pandas.
Code Sample, a copy-pastable example
import pandas as pd
pd.DataFrame([[1,2],[1,2]]).set_index([0,1]).index.nunique()
---------------------------------------------------------------------------
NotImplementedError Traceback (most recent call last)
<ipython-input-80-09716b89b157> in <module>
----> 1 pd.DataFrame([[1,2],[1,2]]).set_index([0,1]).index.nunique()
/opt/conda/lib/python3.7/site-packages/pandas/core/base.py in nunique(self, dropna)
1284 uniqs = self.unique()
1285 n = len(uniqs)
-> 1286 if dropna and isna(uniqs).any():
1287 n -= 1
1288 return n
/opt/conda/lib/python3.7/site-packages/pandas/core/dtypes/missing.py in isna(obj)
124 Name: 1, dtype: bool
125 """
--> 126 return _isna(obj)
127
128
/opt/conda/lib/python3.7/site-packages/pandas/core/dtypes/missing.py in _isna_new(obj)
136 # hack (for now) because MI registers as ndarray
137 elif isinstance(obj, ABCMultiIndex):
--> 138 raise NotImplementedError("isna is not defined for MultiIndex")
139 elif isinstance(obj, type):
140 return False
NotImplementedError: isna is not defined for MultiIndex
Exactly the same error is raised by pandas 0.25.3.
Problem description
The method exists, but it always raises NotImplementedError. As a workaround, I can use len(df.index.unique()). The documentation for an earlier version of pandas had a MultiIndex.nunique entry (https://pandas.pydata.org/pandas-docs/version/0.23.4/generated/pandas.MultiIndex.unique.html), but it has since disappeared. So it is possible that this is not a bug but an intended exception.
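For anyone hitting the same error, here is a minimal sketch of the workaround mentioned above. It only uses the frame from the code sample; the variable names `df` and `n_unique` are mine and purely illustrative.

```python
import pandas as pd

# Same frame as in the code sample: two identical rows, both columns moved into the index.
df = pd.DataFrame([[1, 2], [1, 2]]).set_index([0, 1])

# Workaround: count the distinct index tuples directly instead of calling nunique().
# Note that this does not drop missing values the way nunique(dropna=True) would.
n_unique = len(df.index.unique())
print(n_unique)
```

For this frame the count is 1, which matches the expected output of the report.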
Expected Output
The call should return 1.
Output of pd.show_versions()
INSTALLED VERSIONS
commit : None
python : 3.7.6.final.0
python-bits : 64
OS : Linux
OS-release : 4.19.76-linuxkit
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : en_US.UTF-8
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8

pandas : 1.0.3
numpy : 1.18.1
pytz : 2020.1
dateutil : 2.8.1
pip : 20.1
setuptools : 46.1.3.post20200325
Cython : 0.29.17
pytest : 5.4.1
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : 1.2.8
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 2.11.2
IPython : 7.14.0
pandas_datareader: None
bs4 : 4.9.0
bottleneck : 1.3.2
fastparquet : None
gcsfs : None
lxml.etree : None
matplotlib : 3.2.1
numexpr : 2.7.1
odfpy : None
openpyxl : 3.0.3
pandas_gbq : None
pyarrow : 0.17.0
pytables : None
pytest : 5.4.1
pyxlsb : None
s3fs : None
scipy : 1.4.1
sqlalchemy : 1.3.16
tables : 3.6.1
tabulate : None
xarray : 0.15.1
xlrd : 1.2.0
xlwt : None
xlsxwriter : 1.2.8
numba : 0.48.0
Top GitHub Comments
To me this feels like a bug, or at the very least it is probably not the intended behavior. I think defining isna on a MultiIndex so that this doesn't raise could make sense (or, if not, skipping the NA check for MultiIndex). It does raise the question of what it means for a MultiIndex value to be NA: do all of the values need to be NA, or at least one? If the latter, the adjustment within nunique becomes more complicated.
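To make the "all vs. at least one" question concrete, here is a rough sketch of how the two readings give different counts. This is not how pandas implements or should implement it; the example index and the names `mi`, `levels_na`, `n_drop_any` and `n_drop_all` are assumptions chosen only for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical index with fully present, partially missing, and fully missing entries.
mi = pd.MultiIndex.from_tuples([(1, 2), (1, np.nan), (np.nan, np.nan), (1, 2)])

uniq = mi.unique()

# One boolean column per level, True where that level's value is missing.
levels_na = uniq.to_frame(index=False).isna()

n_all = len(uniq)                                    # no NA handling at all
n_drop_any = int((~levels_na.any(axis=1)).sum())     # entry is NA if ANY level is NA
n_drop_all = int((~levels_na.all(axis=1)).sum())     # entry is NA only if ALL levels are NA

print(n_all, n_drop_any, n_drop_all)
```

Under the "any" reading, several distinct entries can count as NA at once, so the simple `n -= 1` adjustment in the traceback above would no longer be enough.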
take