fillna() does not work when value parameter is a list

fillna() should raise when a list is passed as value.

The results from the fillna() method are very strange when the value parameter is given a list.

For example, using a simple DataFrame:

    import numpy
    import pandas

    df = pandas.DataFrame({'A': [numpy.nan, 1, 2], 'B': [10, numpy.nan, 12], 'C': [[20, 21, 22], [23, 24, 25], numpy.nan]})
    df
         A    B             C
    0  NaN   10  [20, 21, 22]
    1    1  NaN  [23, 24, 25]
    2    2   12           NaN

    df.fillna(value=[100, 101, 102])
         A    B             C
    0  100   10  [20, 21, 22]
    1    1  101  [23, 24, 25]
    2    2   12           102

So it appears the values in the list are used to fill the 'holes' in order, if the list has the same length as the number of holes. But if the list is shorter than the number of holes, the behavior changes to using only the first value in the list:

    df.fillna(value=[100, 101])
         A    B             C
    0  100   10  [20, 21, 22]
    1    1  100  [23, 24, 25]
    2    2   12           100

If the list is longer than the number of holes, you get something even more odd:

    df.fillna(value=[100, 101, 102, 103])
         A    B             C
    0  100   10  [20, 21, 22]
    1    1  100  [23, 24, 25]
    2    2   12           102

If you provide a dict that specifies the fill values by column, the values from the list are used within that column only:

    df.fillna(value={'C': [100, 101]})
         A    B             C
    0  NaN   10  [20, 21, 22]
    1    1  NaN  [23, 24, 25]
    2    2   12           100
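For comparison, here is a minimal sketch of the per-column dict form with one scalar per column, which avoids the list interpretation entirely. The frame below is a simplified, numeric-only stand-in for the one in the report, and the variable names `missing_per_column` and `filled` are illustrative only.

```python
import numpy as np
import pandas as pd

# Simplified stand-in for the report's frame: numeric columns only.
df = pd.DataFrame({"A": [np.nan, 1.0, 2.0], "B": [10.0, np.nan, 12.0]})

# Count the holes per column rather than assuming how many there are.
missing_per_column = df.isna().sum()

# One scalar per column: every NaN in "A" becomes 100, every NaN in "B" becomes 101.
filled = df.fillna(value={"A": 100, "B": 101})

print(missing_per_column)
print(filled)
```

Because each column maps to a single scalar, the result does not depend on how many values are missing.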

Since it’s not always practical to know the number of NaN values a priori, or to customize the length of the value list to match it, this is problematic. Furthermore, some desired values get over-interpreted and cannot be used:

For example, I can't figure out how to replace all NaN instances in a single column with the same list (either empty or non-empty):

    df.fillna(value={'C': [[100, 101]]})
         A    B             C
    0  NaN   10  [20, 21, 22]
    1    1  NaN  [23, 24, 25]
    2    2   12           100

Indeed, if you specify the empty list nothing is filled:

    df.fillna(value={'C': list()})
         A    B             C
    0  NaN   10  [20, 21, 22]
    1    1  NaN  [23, 24, 25]
    2    2   12           NaN

But a dict works fine:

    df.fillna(value={'C': {0: 1}})
         A    B             C
    0  NaN   10  [20, 21, 22]
    1    1  NaN  [23, 24, 25]
    2    2   12        {0: 1}

    df.fillna(value={'C': dict()})
         A    B             C
    0  NaN   10  [20, 21, 22]
    1    1  NaN  [23, 24, 25]
    2    2   12            {}

So it appears that fillna() is making a lot of decisions about how the fill values should be applied, and certain desired outcomes can't be achieved because it's being too 'clever'.
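The outputs above reflect the pandas version in use when the report was filed. As a rough check, and assuming a recent pandas release in which a list passed as value is rejected with a TypeError, the following sketch probes what the installed version does. The helper name `try_list_fill` is hypothetical and not part of pandas.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [np.nan, 1.0, 2.0], "B": [10.0, np.nan, 12.0]})


def try_list_fill(frame, values):
    """Hypothetical helper: return the error text if fillna rejects `values`, else None."""
    try:
        frame.fillna(value=values)
    except TypeError as exc:
        return str(exc)
    return None


# Assumption: recent pandas versions refuse a list here instead of guessing.
message = try_list_fill(df, [100, 101])
print(message)
```

If `message` comes back as None, the installed version still accepts the list and the behavior described in this report may apply.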

Issue Analytics

  • State: closed
  • Created 10 years ago
  • Comments: 15 (9 by maintainers)

Top GitHub Comments

BrenBarn commented, Apr 26, 2014 (12 reactions)

Is there an actual solution to this? What are you supposed to do if you actually want a DataFrame/Series whose values are lists, and you want to replace NaN values with an empty list?

Pranjalya commented, Jun 19, 2020 (0 reactions)

The dict doesn’t work now. 😦
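For the case raised in the comments above (a list-valued column whose missing entries should become empty lists), one possible workaround is to skip fillna() and replace the entries element by element. This is a sketch, not a fix endorsed in the thread; it assumes every valid entry in the column is a list, so anything that is not a list is treated as missing.

```python
import numpy as np
import pandas as pd

# Column "C" from the report: two lists and one missing value.
s = pd.Series([[20, 21, 22], [23, 24, 25], np.nan], name="C")

# Keep existing lists; give each non-list entry its own fresh empty list
# so rows do not share a single mutable object.
filled = s.apply(lambda v: v if isinstance(v, list) else [])

print(filled.tolist())
```

The same expression can be assigned back to a DataFrame column, for example `df['C'] = df['C'].apply(...)`.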
