.ne fails with ambiguous message if comparing a list of columns containing column name 'dtype'

Code Sample, a copy-pastable example if possible

import pandas as pd

df = pd.DataFrame([[0,1,2,'aa'],[0,1,2,'aa'],[0,1,5,'bb'],[0,1,5,'bb'],[0,1,5,'bb'],['cc',4,4,4]], columns=['a','b','c','dtype'])

df.loc[:, ['a', 'dtype']].ne(df.loc[:, ['a', 'dtype']])

In [10]: df
Out[10]: 
    a  b  c dtype
0   0  1  2    aa
1   0  1  2    aa
2   0  1  5    bb
3   0  1  5    bb
4   0  1  5    bb
5  cc  4  4     4

Problem description

Instead of the expected output, I receive:

In [8]: df.loc[:, ['a','dtype']].ne(df.loc[:, ['a', 'dtype']])
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-8-a0710f18822f> in <module>()
----> 1 df.loc[:, ['a','dtype']].ne(df.loc[:, ['a', 'dtype']])

/home/blistein/env/aan/local/lib/python2.7/site-packages/pandas/core/ops.pyc in f(self, other, axis, level)
   1588                 self, other = self.align(other, 'outer',
   1589                                          level=level, copy=False)
-> 1590             return self._compare_frame(other, na_op, str_rep)
   1591 
   1592         elif isinstance(other, ABCSeries):

/home/blistein/env/aan/local/lib/python2.7/site-packages/pandas/core/frame.pyc in _compare_frame(self, other, func, str_rep)
   4790                 return {col: func(a[col], b[col]) for col in a.columns}
   4791 
-> 4792             new_data = expressions.evaluate(_compare, str_rep, self, other)
   4793             return self._constructor(data=new_data, index=self.index,
   4794                                      columns=self.columns, copy=False)

/home/blistein/env/aan/local/lib/python2.7/site-packages/pandas/core/computation/expressions.pyc in evaluate(op, op_str, a, b, use_numexpr, **eval_kwargs)
    201         """
    202 
--> 203     use_numexpr = use_numexpr and _bool_arith_check(op_str, a, b)
    204     if use_numexpr:
    205         return _evaluate(op, op_str, a, b, **eval_kwargs)

/home/blistein/env/aan/local/lib/python2.7/site-packages/pandas/core/computation/expressions.pyc in _bool_arith_check(op_str, a, b, not_allowed, unsupported)
    173         unsupported = {'+': '|', '*': '&', '-': '^'}
    174 
--> 175     if _has_bool_dtype(a) and _has_bool_dtype(b):
    176         if op_str in unsupported:
    177             warnings.warn("evaluating in Python space because the {op!r} "

/home/blistein/env/aan/local/lib/python2.7/site-packages/pandas/core/generic.pyc in __nonzero__(self)
   1574         raise ValueError("The truth value of a {0} is ambiguous. "
   1575                          "Use a.empty, a.bool(), a.item(), a.any() or a.all()."
-> 1576                          .format(self.__class__.__name__))
   1577 
   1578     __bool__ = __nonzero__

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
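
The last two frames hint at the likely cause. The following is a minimal sketch, assuming (the traceback does not show its body) that `_has_bool_dtype` reads the `.dtype` attribute of its argument. A DataFrame defines `dtypes` but no `dtype` attribute of its own, so attribute access falls back to the column named 'dtype' and returns a Series; using a comparison on that Series in a boolean context then raises the error above. The small frame `demo` is made up for illustration.

```python
import pandas as pd

# Hypothetical two-row frame; only the column name 'dtype' matters here.
demo = pd.DataFrame({"a": [0, "cc"], "dtype": ["aa", 4]})

# DataFrame has `dtypes` but no `dtype`, so attribute access falls back
# to column lookup and returns the 'dtype' column as a Series.
shadowed = demo.dtype
print(type(shadowed).__name__)

# Assumption: the internal check does something like `x.dtype == bool`.
# With the column shadowing the attribute, that becomes an elementwise
# Series comparison, and using its result in `if`/`and` raises ValueError.
try:
    bool(shadowed == bool)
    message = None
except ValueError as exc:
    message = str(exc)
print(message)
```

If that assumption holds, the column name alone is enough to trigger the failure, regardless of the values stored in the column.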

Expected Output

       a  dtype
0  False  False
1  False  False
2  False  False
3  False  False
4  False  False
5  False  False

Alternatively, a descriptive error message telling me that I can’t use ‘dtype’ as a column name.

Output of `pd.show_versions()`

INSTALLED VERSIONS

commit: None
python: 2.7.12.final.0
python-bits: 64
OS: Linux
OS-release: 4.15.0-29-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: en_US.utf8
LANG: en_US.UTF-8
LOCALE: None.None

pandas: 0.23.4
pytest: 2.8.7
pip: 18.0
setuptools: 40.0.0
Cython: None
numpy: 1.11.0
scipy: 0.17.0
pyarrow: None
xarray: None
IPython: 5.5.0
sphinx: None
patsy: 0.4.1
dateutil: 2.7.3
pytz: 2014.10
blosc: None
bottleneck: None
tables: 3.2.2
numexpr: 2.4.3
feather: None
matplotlib: 1.5.1
openpyxl: 2.3.0
xlrd: 0.9.4
xlwt: 0.7.5
xlsxwriter: None
lxml: 3.5.0
bs4: None
html5lib: 1.0.1
sqlalchemy: None
pymysql: 0.7.2.None
psycopg2: 2.6.1 (dt dec mx pq3 ext lo64)
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None

Issue Analytics

Created 5 years ago
Comments: 9 (8 by maintainers)

Top GitHub Comments

1 reaction

TomAugspurger commented, Aug 18, 2018

Maybe pandas/tests/test_expressions.py
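
A regression test for this report might look like the following sketch. The test name is hypothetical, and the sketch assumes a pandas version in which the bug is fixed, so that comparing the frame with itself yields an all-False frame with the original column labels.

```python
import pandas as pd


def test_ne_with_column_named_dtype():
    # Hypothetical test name; the frame is the one from the report.
    df = pd.DataFrame(
        [[0, 1, 2, 'aa'], [0, 1, 2, 'aa'], [0, 1, 5, 'bb'],
         [0, 1, 5, 'bb'], [0, 1, 5, 'bb'], ['cc', 4, 4, 4]],
        columns=['a', 'b', 'c', 'dtype'])
    subset = df.loc[:, ['a', 'dtype']]

    result = subset.ne(subset)

    # A frame compared with itself differs nowhere, and the labels survive.
    expected = pd.DataFrame(False, index=subset.index, columns=subset.columns)
    pd.testing.assert_frame_equal(result, expected)
```

On an affected version such as 0.23.4, this test would be expected to fail with the ValueError from the report rather than with an assertion error.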

0 reactions

baidoosik commented, Aug 18, 2018

@TomAugspurger Hi, I fixed the `_has_bool_dtype` function, but I don’t know which test directory is the right one. This is my first attempt at contributing to open source. Could you tell me which directory is most appropriate? Thank you.
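
The comment does not show the change itself. As an illustration only, one way to make such a check robust is to handle DataFrames explicitly before touching `.dtype`. The function name `has_bool_dtype_sketch` is hypothetical, and this is not presented as the patch that was actually submitted.

```python
import numpy as np
import pandas as pd


def has_bool_dtype_sketch(x):
    """Hypothetical stand-in for a dtype check that tolerates a 'dtype' column."""
    if isinstance(x, pd.DataFrame):
        # Inspect the per-column dtypes directly instead of `x.dtype`,
        # which would resolve to a column named 'dtype' if one exists.
        return any(dt == bool for dt in x.dtypes)
    try:
        # Series and NumPy arrays expose a real `dtype` attribute.
        return bool(x.dtype == bool)
    except AttributeError:
        # Plain Python or NumPy scalars have no `dtype` to inspect here.
        return isinstance(x, (bool, np.bool_))


# Usage: a column named 'dtype' no longer interferes with the check.
frame = pd.DataFrame({"a": [0, "cc"], "dtype": ["aa", 4]})
flags = pd.DataFrame({"dtype": [True, False]})
print(has_bool_dtype_sketch(frame), has_bool_dtype_sketch(flags))
```

Checking the type first avoids relying on an AttributeError that never fires once a column shadows the attribute name.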