[BUG-REPORT] Filtering breaks negative indexing

Description

The following code

import pandas as pd
import vaex

p_df = pd.DataFrame({"A": ["abc"] * 100})
df = vaex.from_pandas(p_df)
f_df = df[df["A"] == "abc"]

f_df[99]  # Works fine.
f_df[-1]  # Throws an error (same for any negative number).

throws

---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
Input In [3], in <cell line: 6>()
      3 f_df = df[df["A"] == "abc"]
      5 f_df[99]  # Works fine.
----> 6 f_df[-1]

File ~/mambaforge/envs/tmp_env/lib/python3.9/site-packages/vaex/dataframe.py:5337, in DataFrame.__getitem__(self, item)
   5335 if isinstance(item, int):
   5336     names = self.get_column_names()
-> 5337     return [self.evaluate(name, item, item+1, array_type='python')[0] for name in names]
   5338 elif isinstance(item, six.string_types):
   5339     if hasattr(self, item) and isinstance(getattr(self, item), Expression):

File ~/mambaforge/envs/tmp_env/lib/python3.9/site-packages/vaex/dataframe.py:5337, in <listcomp>(.0)
   5335 if isinstance(item, int):
   5336     names = self.get_column_names()
-> 5337     return [self.evaluate(name, item, item+1, array_type='python')[0] for name in names]
   5338 elif isinstance(item, six.string_types):
   5339     if hasattr(self, item) and isinstance(getattr(self, item), Expression):

File ~/mambaforge/envs/tmp_env/lib/python3.9/site-packages/vaex/dataframe.py:3090, in DataFrame.evaluate(self, expression, i1, i2, out, selection, filtered, array_type, parallel, chunk_size, progress)
   3088     return self.evaluate_iterator(expression, s1=i1, s2=i2, out=out, selection=selection, filtered=filtered, array_type=array_type, parallel=parallel, chunk_size=chunk_size, progress=progress)
   3089 else:
-> 3090     return self._evaluate_implementation(expression, i1=i1, i2=i2, out=out, selection=selection, filtered=filtered, array_type=array_type, parallel=parallel, chunk_size=chunk_size, progress=progress)

File ~/mambaforge/envs/tmp_env/lib/python3.9/site-packages/vaex/dataframe.py:6362, in DataFrameLocal._evaluate_implementation(self, expression, i1, i2, out, selection, filtered, array_type, parallel, chunk_size, raw, progress)
   6360     mask = self._selection_masks[FILTER_SELECTION_NAME]
   6361     i1, i2 = mask.indices(i1, i2-1)
-> 6362     assert i1 != -1
   6363     i2 += 1
   6364 # TODO: performance: can we collapse the two trims in one?

AssertionError: 
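
On affected versions (such as the vaex-core 4.9.2 reported here), one possible workaround is to translate the negative index into its non-negative equivalent before indexing, since the report shows that f_df[99] succeeds where f_df[-1] fails. The sketch below is only an illustration: normalize_index is a hypothetical helper, not part of vaex, and a plain Python list stands in for the filtered DataFrame so the block runs without vaex installed. It assumes that len(f_df) returns the number of rows remaining after the filter.

```python
def normalize_index(index: int, length: int) -> int:
    """Translate a possibly negative index into its non-negative equivalent.

    Hypothetical helper for illustration; it is not part of vaex.
    """
    if not -length <= index < length:
        raise IndexError(f"index {index} is out of range for length {length}")
    return index + length if index < 0 else index


# Stand-in for the filtered DataFrame from the report (100 matching rows).
rows = ["abc"] * 100
last = rows[normalize_index(-1, len(rows))]

# With vaex the same idea would read (assumption, not executed here):
#   f_df[normalize_index(-1, len(f_df))]
```

The helper raises IndexError for out-of-range values itself, so a bad index does not reach the assertion shown in the traceback.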

Software information

  • Vaex version (import vaex; vaex.__version__): {'vaex-core': '4.9.2', 'vaex-viz': '0.5.2', 'vaex-hdf5': '0.12.2', 'vaex-server': '0.8.1', 'vaex-astro': '0.9.1', 'vaex-jupyter': '0.8.0', 'vaex-ml': '0.17.0'}
  • Vaex was installed via: mamba install -c conda-forge vaex
  • OS: macOS Monterey, Version 12.4

Additional information

No additional information to add.

Issue Analytics

  • State: closed
  • Created a year ago
  • Comments: 5 (5 by maintainers)

Top GitHub Comments

maartenbreddels commented, Aug 31, 2022 (1 reaction)

Ok, this now works with master, so probably fixed in #2123

maartenbreddels commented, Aug 31, 2022 (0 reactions)

Ok, I was too quick, this is really fixed in https://github.com/vaexio/vaex/pull/2163 and will be released in the next version!
