pd.Series.ffill() raises AttributeError: 'numpy.ndarray' object has no attribute 'ffill'

System information

  • OS Platform and Distribution: Windows 10
  • Modin version: 0.15.0+7.g4ec7f634
  • Python version: 3.9.12
  • Code we can use to reproduce:
import modin
import modin.pandas as pd
from distributed import Client

import numpy as np


if __name__ == '__main__':
    if pd.__name__ == 'modin.pandas':
        client = Client(n_workers=3)
        print(modin.__version__)

    df = pd.DataFrame(
        dict(
            a=[1, 2, None, None, None, ],
            b=(1, None, 3, 4, 5,),
        )
    )

    df.a.ffill(inplace=True)
    print(df)

    df['tr_id'] = 0

    df.tr_id = np.where(
        (df.b <= 4),
        3,
        None,
    )

    df.tr_id.ffill(inplace=True)

Describe the problem

I get the following output:

0.15.0+7.g4ec7f634
UserWarning: Ray execution environment not yet initialized. Initializing...
To remove this warning, run the following python code before doing dataframe operations:

    import ray
    ray.init()

UserWarning: Distributing <class 'dict'> object. This may take some time.
     a    b
0  1.0  1.0
1  2.0  NaN
2  2.0  3.0
3  2.0  4.0
4  2.0  5.0
Traceback (most recent call last):
  File "d:\OD\OneDrive\Projects\Chud_Amaz\Soft_in_dev\moduled_way_OOP\modin_test_ffill.py", line 33, in <module>
    df.tr_id.ffill(inplace=True)
AttributeError: 'numpy.ndarray' object has no attribute 'ffill'

Problem

When a new column is filled with np.where, the ffill method does not work on it.

Issue Analytics

  • State: closed
  • Created a year ago
  • Reactions: 1
  • Comments: 5 (3 by maintainers)

Top GitHub Comments

2 reactions
mvashishtha commented, Jun 16, 2022

Maybe put the resulting numpy array in a Series on the right-hand side?

@VasilijKolomiets unfortunately it turns out that that won’t work in Modin either 😢 Modin assigns the right-hand side as-is to the attribute instead of pointing the attribute to the new column.

import modin.pandas as pd

df = pd.DataFrame([[1]], columns=['col0'])
df.col0 = pd.Series([3])
df.iloc[0, 0] = 4
# BUG: col0 is unchanged!!!
assert df.col0.equals(df['col0'])

What you can do instead is use __setitem__ instead of __setattr__, e.g. df['tr_id'] = np.where(... instead of df.tr_id = np.where(.... This works:

import modin.pandas as pd

df = pd.DataFrame([[1]], columns=['col0'])
df['col0'] = pd.Series([3])
df.iloc[0, 0] = 4
assert df.col0.equals(df['col0'])

Meanwhile, @pyrito will work on a PR that should fix all the bugs identified here.
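
For reference, here is a minimal sketch of that workaround applied to the data from the original report. It is written against plain pandas as a stand-in so it runs without a cluster; the assumption is that replacing the import with `import modin.pandas as pd` behaves the same way once item assignment is used. Assigning the result of `ffill()` back to the column (rather than using `inplace=True`) is a choice made for this sketch, not something the thread prescribes.

```python
import numpy as np
import pandas as pd  # assumption: stand-in for `import modin.pandas as pd`

df = pd.DataFrame(
    {
        "a": [1, 2, None, None, None],
        "b": [1, None, 3, 4, 5],
    }
)

# Item assignment (__setitem__) stores the array as a real column,
# so df["tr_id"] is a Series rather than the raw numpy.ndarray.
df["tr_id"] = np.where(df["b"] <= 4, 3, None)

# Forward-fill the missing entries and write the result back to the column.
df["tr_id"] = df["tr_id"].ffill()

print(df)
```

The key point is only the first assignment: because the column is created through `df['tr_id'] = ...`, later access returns a Series that has an `ffill` method.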

2 reactions
mvashishtha commented, Jun 15, 2022

In the snippet I posted above, the Modin dataframe’s __setattr__ calls __setitem__ to modify the col0 column in place. It then calls object.__setattr__(self, key, value), which re-assigns the col0 property to exactly the value that was passed in, i.e. the list. I think a fairly simple fix would be to call object.__setattr__(self, key, self.__getitem__(key)) in the case here where we call __setitem__.

I can’t take this on right now, so I’ll leave it unassigned.
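
The mechanism described in this comment can be illustrated with a toy class in pure Python. This is only a sketch: `BuggyFrame` and `FixedFrame` are hypothetical names, columns are plain lists, and nothing here is Modin's actual code. It mimics the described behavior (call `__setitem__`, then cache the raw right-hand side on the instance) and the proposed change (cache the result of `__getitem__` instead).

```python
class BuggyFrame:
    """Toy stand-in for a dataframe; hypothetical, not Modin's implementation."""

    def __init__(self, columns):
        # Bypass the custom __setattr__ while creating internal storage.
        object.__setattr__(
            self, "_columns", {k: list(v) for k, v in columns.items()}
        )

    def __getitem__(self, key):
        return self._columns[key]

    def __setitem__(self, key, value):
        # Convert the input into the frame's own column type (a list here).
        self._columns[key] = list(value)

    def __getattr__(self, key):
        # Only reached when normal attribute lookup fails.
        try:
            return object.__getattribute__(self, "_columns")[key]
        except KeyError:
            raise AttributeError(key) from None

    def __setattr__(self, key, value):
        if key in self._columns:
            self.__setitem__(key, value)
        # Bug being illustrated: the raw right-hand side is stored on the
        # instance, so later attribute access returns it, not the column.
        object.__setattr__(self, key, value)


class FixedFrame(BuggyFrame):
    def __setattr__(self, key, value):
        if key in self._columns:
            self.__setitem__(key, value)
            # Proposed fix: point the attribute at the stored column.
            object.__setattr__(self, key, self.__getitem__(key))
        else:
            object.__setattr__(self, key, value)


raw = (3, None)  # a tuple stands in for the numpy array from np.where

buggy = BuggyFrame({"tr_id": [0, 0]})
buggy.tr_id = raw
buggy_returns_raw = buggy.tr_id is raw

fixed = FixedFrame({"tr_id": [0, 0]})
fixed.tr_id = raw
fixed_returns_column = fixed.tr_id is fixed["tr_id"]

print(buggy_returns_raw, fixed_returns_column)
```

In the buggy variant, attribute access hands back the original right-hand side object, which is why calling a Series method such as `ffill` on it fails when that object is a `numpy.ndarray`.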
