pd.Series.ffill() raises AttributeError: 'numpy.ndarray' object has no attribute 'ffill'
System information
- OS Platform and Distribution: Windows 10
- Modin version: 0.15.0+7.g4ec7f634
- Python version: 3.9.12
- Code we can use to reproduce:
import modin
import modin.pandas as pd
from distributed import Client
import numpy as np

if __name__ == '__main__':
    if pd.__name__ == 'modin.pandas':
        client = Client(n_workers=3)
    print(modin.__version__)
    df = pd.DataFrame(
        dict(
            a=[1, 2, None, None, None, ],
            b=(1, None, 3, 4, 5,),
        )
    )
    df.a.ffill(inplace=True)
    print(df)
    df['tr_id'] = 0
    df.tr_id = np.where(
        (df.b <= 4),
        3,
        None,
    )
    df.tr_id.ffill(inplace=True)
Describe the problem
I have the output:
0.15.0+7.g4ec7f634
UserWarning: Ray execution environment not yet initialized. Initializing...
To remove this warning, run the following python code before doing dataframe operations:
    import ray
    ray.init()
UserWarning: Distributing <class 'dict'> object. This may take some time.
     a    b
0  1.0  1.0
1  2.0  NaN
2  2.0  3.0
3  2.0  4.0
4  2.0  5.0
Traceback (most recent call last):
  File "d:\OD\OneDrive\Projects\Chud_Amaz\Soft_in_dev\moduled_way_OOP\modin_test_ffill.py", line 33, in <module>
    df.tr_id.ffill(inplace=True)
AttributeError: 'numpy.ndarray' object has no attribute 'ffill'
Problem
When a new column is filled with np.where via attribute assignment (df.tr_id = ...), the ffill method does not work on it.
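For context on the type named in the error message: np.where returns a plain numpy.ndarray, and ndarrays have no ffill method. A minimal check, using a small made-up input array rather than the dataframe column from the report:

```python
import numpy as np

# Hypothetical input mirroring column b from the report.
b = np.array([1.0, np.nan, 3.0, 4.0, 5.0])

# Same shape of call as in the reproduction script.
result = np.where(b <= 4, 3, None)

print(type(result).__name__)     # ndarray
print(hasattr(result, "ffill"))  # False
```

So if df.tr_id ends up bound to this array instead of a Series, calling .ffill() on it fails exactly as shown in the traceback.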
@VasilijKolomiets unfortunately it turns out that that won’t work in Modin either 😢 Modin assigns the right-hand side as-is to the attribute instead of pointing the attribute to the new column.
What you can do instead is use __setitem__ rather than __setattr__, e.g. df['tr_id'] = np.where(...) instead of df.tr_id = np.where(...). This works. Meanwhile, @pyrito will work on a PR that should fix all the bugs identified here.
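A minimal sketch of the __setitem__ workaround described above. It imports plain pandas so it runs anywhere; the assumption is that swapping the import for modin.pandas behaves the same way, which is what the comment above reports. The result is reassigned rather than filled with inplace=True, which is a choice made for this sketch and not part of the original suggestion.

```python
import numpy as np
import pandas as pd  # assumption: replace with `import modin.pandas as pd` to try it on Modin

df = pd.DataFrame({"a": [1, 2, None, None, None], "b": [1, None, 3, 4, 5]})

# Item assignment (__setitem__) stores the ndarray as a real column.
df["tr_id"] = np.where(df["b"] <= 4, 3, None)

# Reading the column back yields a Series, so ffill is available.
filled = df["tr_id"].ffill()
df["tr_id"] = filled

print(df)
```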
In the snippet I posted above, the Modin dataframe’s __setattr__ calls __setitem__ to modify the col0 column in place. It then calls object.__setattr__(self, key, value), which re-assigns the col0 property to exactly the value that was passed in, i.e. the list. I think a fairly simple fix would be to call object.__setattr__(self, key, self.__getitem__(key)) in the case here where we call __setitem__.
I can’t take this on right now, so I’ll leave it unassigned.
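To make the mechanism concrete, here is a toy model of the behaviour described above. ToyColumn, BuggyFrame and FixedFrame are hypothetical classes written for illustration only; they are not Modin's code, and the actual fix in Modin may look different.

```python
class ToyColumn:
    """Hypothetical column wrapper with a forward-fill method."""

    def __init__(self, values):
        self.values = list(values)

    def ffill(self):
        filled, last = [], None
        for v in self.values:
            last = v if v is not None else last
            filled.append(last)
        return ToyColumn(filled)


class BuggyFrame:
    """Toy frame whose __setattr__ mirrors the reported behaviour."""

    def __init__(self):
        object.__setattr__(self, "_columns", {})

    def __setitem__(self, key, value):
        self._columns[key] = ToyColumn(value)

    def __getitem__(self, key):
        return self._columns[key]

    def __setattr__(self, key, value):
        if key in self._columns:
            self.__setitem__(key, value)
        # Binds the attribute to the raw right-hand side, not the column.
        object.__setattr__(self, key, value)


class FixedFrame(BuggyFrame):
    """Same toy frame with the fix proposed in the comment above."""

    def __setattr__(self, key, value):
        if key in self._columns:
            self.__setitem__(key, value)
            # Bind the attribute to the stored column instead.
            object.__setattr__(self, key, self.__getitem__(key))
        else:
            object.__setattr__(self, key, value)


buggy = BuggyFrame()
buggy["tr_id"] = [0, 0, 0]
buggy.tr_id = [3, None, 3]
print(hasattr(buggy.tr_id, "ffill"))  # False: the attribute is the raw list

fixed = FixedFrame()
fixed["tr_id"] = [0, 0, 0]
fixed.tr_id = [3, None, 3]
fixed_values = fixed.tr_id.ffill().values
print(fixed_values)  # [3, 3, 3]
```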