Add Table `zip_columns` method for faster row iteration

One way to substantially improve performance when iterating over rows is to avoid explicit use of Row for access. Instead of:

for row in tbl:
   function(row['a'], row['b'], row['z'])

Do:

for a, b, z in zip(tbl['a'], tbl['b'], tbl['z']):
   function(a, b, z)

This is much faster because it ends up using C-level iteration instead of a lot of Python code to make a Row and then access it.

In many cases of iterating over table rows, one only cares about a small subset of the column values. So we can make the above nicer with:

for a, b, z in tbl.zip_columns('a', 'b', 'z'):
   function(a, b, z)
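
To make the proposal concrete, here is a minimal sketch of what such a helper could look like, written as a free function rather than a Table method. It is not the library's implementation: the function body, the type hints, and the plain dict of lists standing in for a table are all assumptions made for illustration. The only thing it relies on is that `tbl[name]` returns an iterable column.

```python
from typing import Any, Iterator, Mapping, Sequence, Tuple


def zip_columns(
    tbl: Mapping[str, Sequence[Any]], *names: str
) -> Iterator[Tuple[Any, ...]]:
    """Return an iterator of per-row tuples holding only the requested columns.

    Hypothetical free-function version of the proposed method. `tbl` is
    anything where `tbl[name]` returns an iterable column.
    """
    if not names:
        raise ValueError("at least one column name is required")
    # Look up each column once, then let zip() drive the iteration,
    # so no per-row object is built in Python code.
    columns = [tbl[name] for name in names]
    return zip(*columns)


# Stand-in for a table: a plain dict of equal-length columns (an assumption).
tbl = {"a": [1, 2, 3], "b": [4.0, 5.0, 6.0], "z": ["x", "y", "z"]}

rows = list(zip_columns(tbl, "a", "z"))
```

The values in each tuple follow the order of the names passed in, so `zip_columns(tbl, "z", "a")` yields the same rows with the two values swapped.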

One question is whether the name is confusing. Technically it should be called zip_columns, but I’m not sure that would make more sense to the average user. [The name was later changed to zip_columns.]

@himanshupathak21061998 - this might be a good straightforward table performance issue to try.

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 7 (7 by maintainers)

Top GitHub Comments

geekypathak21 commented, Jul 6, 2019 (2 reactions)

Sure, I will try to fix this.

mhvk commented, Jul 9, 2019 (0 reactions)

Agreed - zip_columns is useful regardless - it also helps make sure the columns are zipped in an explicit order.
