Add Table `zip_columns` method for faster row iteration
One way to substantially improve performance when iterating over rows is to avoid explicit use of Row for access. Instead of:
for row in tbl:
    function(row['a'], row['b'], row['z'])
Do:
for a, b, z in zip(tbl['a'], tbl['b'], tbl['z']):
    function(a, b, z)
This is much faster because it ends up using C-level iteration instead of a lot of Python code to make a Row and then access it.
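To make the difference concrete, here is a minimal sketch of the two access patterns. `ToyTable` and this `Row` are hypothetical stand-ins written for illustration (a dict of lists plus a per-row wrapper), not the real Table and Row classes, so it only shows where the per-row Python work comes from rather than measuring the real library.

```python
class Row:
    """Hypothetical stand-in for a table row: looks values up by column name."""

    def __init__(self, columns, index):
        self._columns = columns
        self._index = index

    def __getitem__(self, name):
        return self._columns[name][self._index]


class ToyTable:
    """Hypothetical dict-of-lists table, not the real Table class."""

    def __init__(self, **columns):
        self.columns = columns

    def __getitem__(self, name):
        return self.columns[name]

    def __len__(self):
        return len(next(iter(self.columns.values())))

    def __iter__(self):
        # One Python-level Row object is built for every row.
        return (Row(self.columns, i) for i in range(len(self)))


tbl = ToyTable(a=[1, 2, 3], b=[10, 20, 30], z=[100, 200, 300])

# Row-based access: a Row is created per row, then three __getitem__ calls.
via_rows = [(row['a'], row['b'], row['z']) for row in tbl]

# Column-based access: each column is looked up once, then zip walks them.
via_zip = list(zip(tbl['a'], tbl['b'], tbl['z']))
```

Both patterns yield the same tuples; the column-based one simply skips the per-row object creation and name lookups.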
In many cases of iterating over table rows, one only cares about a small subset of the column values. So we can make the above nicer with:
for a, b, z in tbl.zip_columns('a', 'b', 'z'):
    function(a, b, z)
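As a rough sketch of what such a method might do, the proposal can be written as a free function. This is an assumption about the behavior, not the eventual implementation, and the plain dict used as `tbl` here is only a stand-in for a table whose columns are indexable by name.

```python
def zip_columns(tbl, *names):
    """Hypothetical helper: iterate over the named columns in lockstep.

    `tbl` is anything that returns an iterable column for tbl[name].
    """
    # Look up each column once, in the order the caller asked for.
    columns = [tbl[name] for name in names]
    return zip(*columns)


# Stand-in table: a dict of equal-length columns (an assumption for the demo).
tbl = {"a": [1, 2, 3], "b": [4.0, 5.0, 6.0], "z": ["x", "y", "w"]}

rows = list(zip_columns(tbl, "a", "b", "z"))

# Only the requested columns are touched, and the order is explicit.
swapped = list(zip_columns(tbl, "z", "a"))
```

Because the column order comes from the argument list, the tuple layout does not depend on how the table itself orders its columns.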
One question is whether the name is confusing. Technically it should be called zip_columns, but I'm not sure that would make more sense to the average user. [Changed to zip_columns.]
@himanshupathak21061998 - this might be a good straightforward table performance issue to try.
Issue Analytics
- State:
- Created 4 years ago
- Comments:7 (7 by maintainers)
Top GitHub Comments
Sure, I will try to fix this.
Agreed, zip_columns is useful regardless; it also helps make sure the columns are zipped in an explicit order.