Bug when filling missing values with transform?
Hello there,
Consider this:
import numpy as np
import pandas as pd

df = pd.DataFrame({'group': ['A', 'A', 'A', 'B', 'B', 'B', 'B', 'B'],
                   'B': [np.nan, np.nan, np.nan, -4, -2, 5, 8, 7],
                   'C': [-5, 5, -20, 0, np.nan, 5, 4, -4]})
df
Out[13]:
B C group
0 NaN -5.0 A
1 NaN 5.0 A
2 NaN -20.0 A
3 -4.0 0.0 B
4 -2.0 NaN B
5 5.0 5.0 B
6 8.0 4.0 B
7 7.0 -4.0 B
Now I want to forward-fill the missing values in C within each group:
df.groupby('group').C.fillna(method='ffill')
Out[11]:
0 -5.0
1 5.0
2 -20.0
3 0.0
4 0.0
5 5.0
6 4.0
7 -4.0
Name: C, dtype: float64
df.groupby('group').C.transform('ffill')
Out[12]:
0 -5.0
1 -5.0
2 -5.0
3 5.0
4 5.0
5 5.0
6 5.0
7 5.0
dtype: float64
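The transform output is wrong: it repeats a single value per group instead of forward-filling, and the two groups do not even line up with the right rows. Is that expected? I am on pandas 0.18.1.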
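For comparison, here is a minimal sketch of the result I would expect. It assumes a recent pandas release in which `GroupBy.ffill()` is available and `transform('ffill')` dispatches to it; I have not checked this against 0.18.1. The variable names are only for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'group': ['A', 'A', 'A', 'B', 'B', 'B', 'B', 'B'],
    'B': [np.nan, np.nan, np.nan, -4, -2, 5, 8, 7],
    'C': [-5, 5, -20, 0, np.nan, 5, 4, -4],
})

# Forward-fill C within each group using the dedicated groupby method.
via_ffill = df.groupby('group')['C'].ffill()

# The same operation expressed through transform, once by method name
# and once with an explicit callable applied to each group's Series.
via_transform_name = df.groupby('group')['C'].transform('ffill')
via_transform_func = df.groupby('group')['C'].transform(lambda s: s.ffill())

print(via_ffill.tolist())
# [-5.0, 5.0, -20.0, 0.0, 0.0, 5.0, 4.0, -4.0]
```

All three should agree: only row 4 changes, taking the value 0.0 from row 3 of the same group.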
Issue Analytics
- State:
- Created 7 years ago
- Comments:16 (10 by maintainers)
Looks like https://github.com/pandas-dev/pandas/issues/24211 is the same issue and has a unit test, so I think we are safe to close.
@geoffrey-eisenbarth It appears so. I’d also recommend searching the groupby tests for fillna used in transform - perhaps one already exists and this issue wasn’t known about.
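If no such test turns up, something along these lines might serve as a starting point. This is only a sketch, not the test attached to the linked issue: the test name and the `key`/`val` columns are made up, and it assumes a recent pandas where `transform('ffill')` is supported. It checks that a fill never leaks across a group boundary.

```python
import numpy as np
import pandas as pd


def test_transform_ffill_respects_group_boundaries():
    # Hypothetical frame: the first row of group 'b' is missing and must
    # stay missing, because there is nothing earlier in 'b' to fill from.
    df = pd.DataFrame({
        'key': ['a', 'a', 'b', 'b'],
        'val': [1.0, np.nan, np.nan, 2.0],
    })

    result = df.groupby('key')['val'].transform('ffill')
    expected = pd.Series([1.0, 1.0, np.nan, 2.0])

    # check_names=False so the sketch does not depend on whether the
    # result carries the column name.
    pd.testing.assert_series_equal(result, expected, check_names=False)


test_transform_ffill_respects_group_boundaries()
```

The original report only had a gap in the middle of a group, so the boundary case is worth covering separately.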