IndexError when there are fewer DataFrame rows than workers

When a DataFrame has fewer rows than pandarallel has workers, parallel_apply raises an IndexError. Minimal example:

Code

import time
import pandas as pd
from pandarallel import pandarallel

pandarallel.initialize(progress_bar=True)

df = pd.DataFrame({'x':[1,2]})
df.parallel_apply(lambda row: print('A'), time.sleep(2), print('B'), axis=1)

Output

INFO: Pandarallel will run on 6 workers.
INFO: Pandarallel will use Memory file system to transfer data between the main process and workers.
B
   0.00%                                          |        0 /        1 |
   0.00%                                          |        0 /        1 |
Traceback (most recent call last):
  File "foo.py", line 8, in <module>
    df.parallel_apply(lambda row: print('A'), time.sleep(2), print('B'), axis=1)
  File "$VIRTUAL_ENV/lib/python3.7/site-packages/pandarallel/pandarallel.py", line 446, in closure
    map_result,
  File "$VIRTUAL_ENV/lib/python3.7/site-packages/pandarallel/pandarallel.py", line 382, in get_workers_result
    progress_bars.update(progresses)
  File "$VIRTUAL_ENV/lib/python3.7/site-packages/pandarallel/utils/progress_bars.py", line 82, in update
    self.__bars[index][0] = value
IndexError: list index out of range
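
A side note on the output above: in the minimal example only the first lambda is the applied function. The time.sleep(2) and print('B') expressions are ordinary arguments, so Python evaluates them once before the apply call starts, which would explain why B is printed before any A. The sketch below illustrates this with a hypothetical stand-in (`fake_apply` and `record` are illustration-only names, not pandas or pandarallel functions); the tuple form in the second call is an assumption about what the reporter may have intended.

```python
calls = []

def record(label):
    """Log a label so the evaluation order can be inspected afterwards."""
    calls.append(label)
    return label

def fake_apply(func, *args, **kwargs):
    """Hypothetical stand-in for an apply-style method: calls func on one row."""
    calls.append("apply starts")
    return func("row")

# Same shape as the example in the issue: the second and third expressions
# are evaluated eagerly, before fake_apply is entered.
fake_apply(lambda row: record("A"), record("sleep"), record("B"), axis=1)
as_written = list(calls)
print(as_written)  # ['sleep', 'B', 'apply starts', 'A']

# Assumed intent: run all three steps per row by grouping them in one lambda.
calls.clear()
fake_apply(lambda row: (record("A"), record("sleep"), record("B")), axis=1)
grouped = list(calls)
print(grouped)  # ['apply starts', 'A', 'sleep', 'B']
```

This does not change the IndexError itself, which the traceback places in the progress bar update.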

I’m using Python 3.7.4 with pandas 0.25.3 and pandarallel 1.4.4.
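
A possible workaround, offered as an untested sketch rather than a confirmed fix: since the error appears when there are fewer rows than workers, one could cap the worker count at the row count before initializing. The helper name `clamp_workers` is hypothetical; `nb_workers` is an existing parameter of `pandarallel.initialize`, but whether lowering it avoids the error on 1.4.4 is an assumption.

```python
import os

def clamp_workers(n_rows, max_workers=None):
    """Return a worker count that never exceeds the number of rows.

    Hypothetical helper: falls back to the CPU count when max_workers
    is not given, and always returns at least 1.
    """
    if max_workers is None:
        max_workers = os.cpu_count() or 1
    return max(1, min(n_rows, max_workers))

# With 2 rows and 6 available workers, only 2 workers are requested.
print(clamp_workers(2, 6))    # 2
print(clamp_workers(100, 6))  # 6
print(clamp_workers(0, 6))    # 1
```

It could then be used as `pandarallel.initialize(nb_workers=clamp_workers(len(df)), progress_bar=True)`.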

Issue Analytics

  • State: closed
  • Created: 4 years ago
  • Reactions: 5
  • Comments: 6 (2 by maintainers)

Top GitHub Comments

hawktang commented, May 27, 2020 (1 reaction)

I have enough rows, but I see the same issue here:

  File "/home/hawktang/anaconda3/envs/topic_classifier/lib/python3.7/multiprocessing/connection.py", line 407, in _recv_bytes
    buf = self._recv(4)
  File "/home/hawktang/anaconda3/envs/topic_classifier/lib/python3.7/multiprocessing/connection.py", line 383, in _recv
    raise EOFError
EOFError

jasonminsookim commented, Sep 18, 2020 (0 reactions)

I’m still having this issue.
