REGR: Behavior change with empty apply in pandas 1.3.0rc1

Problem description

The following (toy) snippet worked with 1.2:

import pandas as pd

df = pd.DataFrame(columns=["a", "b"])  # empty frame: two columns, no rows
df["a"] = df.apply(lambda x: x["a"], axis=1)

With 1.3 it fails with ValueError: Columns must be same length as key

Technically this is correct: apply on an empty frame returns the empty frame itself, so a two-column frame is being assigned to the single column "a" and the shapes do not match.
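
One way to see the mismatch is to capture what apply returns before assigning it. The sketch below is illustrative only. The result type on an empty frame may depend on the pandas version (the report above describes an empty DataFrame under 1.3), so no particular output is assumed.

```python
import pandas as pd

df = pd.DataFrame(columns=["a", "b"])

# Capture the result instead of assigning it straight into df["a"].
result = df.apply(lambda x: x["a"], axis=1)

# A DataFrame here means setitem would receive several columns for one key;
# a Series can be assigned to a single column directly.
print(type(result).__name__, result.shape)
returns_frame = isinstance(result, pd.DataFrame)
```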

Expected Output

It still works? Just reporting it here if this is an unintended change. Maybe I missed it, but I did not see this mentioned in the changelog.

The fix is to only call apply when the frame is not empty I guess? I stumbled upon this one when running our test suite.
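
A minimal sketch of that guard, assuming a hypothetical helper named assign_row_apply (it is not part of pandas): apply runs only when the frame has rows, and an empty frame is returned untouched.

```python
import pandas as pd


def assign_row_apply(df, column, func):
    """Hypothetical helper: set df[column] from a row-wise apply, skipping empty frames."""
    if not df.empty:
        # Only call apply when there is at least one row to process.
        df[column] = df.apply(func, axis=1)
    return df


# Empty frame: apply is never called, so the assignment cannot fail.
empty = pd.DataFrame(columns=["a", "b"])
assign_row_apply(empty, "a", lambda x: x["a"])

# Non-empty frame: behaves like the plain apply-and-assign.
full = pd.DataFrame({"a": [1, 2], "b": [10, 20]})
assign_row_apply(full, "a", lambda x: x["a"] + x["b"])
```

Note that with this sketch the target column is not created on an empty frame if it did not already exist; whether that matters depends on the caller.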

Issue Analytics

  • State: open
  • Created 2 years ago
  • Reactions: 1
  • Comments: 18 (14 by maintainers)

Top GitHub Comments

3 reactions
simonjayhawkins commented, Nov 27, 2021

Looking at this issue purely from a backport-fix perspective, it doesn’t appear that we have any solutions here (for 1.3.x).

Changing the behavior of apply would not be suitable for a backport.

For backport, we would either need to:

  1. Revert the change to setitem that caused the regression.
  2. Catch the apply inconsistency in setitem.

I think option 2 has been ruled out; see https://github.com/pandas-dev/pandas/issues/41997#issuecomment-860671005 and https://github.com/pandas-dev/pandas/issues/41997#issuecomment-860833386

I think option 1 is undesirable (especially late in the 1.3.x branch) since the change was a bugfix and is now correct behavior; see https://github.com/pandas-dev/pandas/issues/41997#issuecomment-860629747

I propose we remove this from the 1.3.5 milestone.

1 reaction
jordantshaw commented, Jul 27, 2021

I am also experiencing the same issue. Example below.

import pandas as pd

df = pd.DataFrame(columns=["a", "b"])
df['a'] = df.apply(lambda x: x["a"], axis=1)

I would expect ‘x’ in this case to be an empty Series, but instead it is returning an empty DataFrame. Any ideas on expected resolution?
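
If an empty Series is what the caller wants, one option worth trying is the documented result_type="reduce" argument of DataFrame.apply, which asks for a Series rather than an expanded frame. This is a sketch, not something proposed in the thread, and it is an assumption that it behaves the same way on every pandas version, so check it against the version you run.

```python
import pandas as pd

df = pd.DataFrame(columns=["a", "b"])

# result_type="reduce" asks apply for a Series where possible
# (the option only acts when axis=1).
result = df.apply(lambda x: x["a"], axis=1, result_type="reduce")

# Assign only when the result really is a Series, so a frame-shaped
# result never reaches the single-column setitem.
if isinstance(result, pd.Series):
    df["a"] = result
```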
