Tutorial 04 failing - AttributeError: 'DataFrame' object has no attribute '_data'

Hello! I’ve been trying Metaflow, but I got stuck on tutorial 04.

This is what I get when I try to run it:

2020-08-11 17:09:02.454 Workflow starting (run-id 1597176542444702):
2020-08-11 17:09:03.378 [1597176542444702/start/1 (pid 7882)] Task is starting.
2020-08-11 17:09:04.594 [1597176542444702/start/1 (pid 7882)]     Internal error
2020-08-11 17:09:04.600 [1597176542444702/start/1 (pid 7882)] Traceback (most recent call last):
2020-08-11 17:09:04.600 [1597176542444702/start/1 (pid 7882)]   File "/tmp/tmplj4lal94/metaflow/cli.py", line 884, in main
2020-08-11 17:09:04.600 [1597176542444702/start/1 (pid 7882)]     start(auto_envvar_prefix='METAFLOW', obj=state)
2020-08-11 17:09:04.600 [1597176542444702/start/1 (pid 7882)] Using metadata provider: local@/Users/bi002708/OneDrive - BANCO INTER SA/metaflow-tutorials
2020-08-11 17:09:04.600 [1597176542444702/start/1 (pid 7882)]   File "/opt/anaconda3/envs/metaflow_PlayListFlow_osx-64_2b8c423287a339b3e7c633a6dc58580161663113/lib/python3.7/site-packages/click/core.py", line 764, in __call__
2020-08-11 17:09:04.600 [1597176542444702/start/1 (pid 7882)]     return self.main(args, kwargs)
2020-08-11 17:09:04.664 [1597176542444702/start/1 (pid 7882)]   File "/opt/anaconda3/envs/metaflow_PlayListFlow_osx-64_2b8c423287a339b3e7c633a6dc58580161663113/lib/python3.7/site-packages/click/core.py", line 717, in main
2020-08-11 17:09:04.664 [1597176542444702/start/1 (pid 7882)]     rv = self.invoke(ctx)
2020-08-11 17:09:04.664 [1597176542444702/start/1 (pid 7882)]   File "/opt/anaconda3/envs/metaflow_PlayListFlow_osx-64_2b8c423287a339b3e7c633a6dc58580161663113/lib/python3.7/site-packages/click/core.py", line 1137, in invoke
2020-08-11 17:09:04.664 [1597176542444702/start/1 (pid 7882)]     return _process_result(sub_ctx.command.invoke(sub_ctx))
2020-08-11 17:09:04.665 [1597176542444702/start/1 (pid 7882)]   File "/opt/anaconda3/envs/metaflow_PlayListFlow_osx-64_2b8c423287a339b3e7c633a6dc58580161663113/lib/python3.7/site-packages/click/core.py", line 956, in invoke
2020-08-11 17:09:04.665 [1597176542444702/start/1 (pid 7882)]     return ctx.invoke(self.callback, ctx.params)
2020-08-11 17:09:04.665 [1597176542444702/start/1 (pid 7882)]   File "/opt/anaconda3/envs/metaflow_PlayListFlow_osx-64_2b8c423287a339b3e7c633a6dc58580161663113/lib/python3.7/site-packages/click/core.py", line 555, in invoke
2020-08-11 17:09:04.665 [1597176542444702/start/1 (pid 7882)]     return callback(args, kwargs)
2020-08-11 17:09:04.665 [1597176542444702/start/1 (pid 7882)]   File "/opt/anaconda3/envs/metaflow_PlayListFlow_osx-64_2b8c423287a339b3e7c633a6dc58580161663113/lib/python3.7/site-packages/click/decorators.py", line 27, in new_func
2020-08-11 17:09:04.665 [1597176542444702/start/1 (pid 7882)]     return f(get_current_context().obj, args, kwargs)
2020-08-11 17:09:04.665 [1597176542444702/start/1 (pid 7882)]   File "/tmp/tmplj4lal94/metaflow/cli.py", line 445, in step
2020-08-11 17:09:04.665 [1597176542444702/start/1 (pid 7882)]     max_user_code_retries)
2020-08-11 17:09:04.665 [1597176542444702/start/1 (pid 7882)]   File "/tmp/tmplj4lal94/metaflow/task.py", line 451, in run_step
2020-08-11 17:09:04.665 [1597176542444702/start/1 (pid 7882)]     output.persist(self.flow)
2020-08-11 17:09:04.665 [1597176542444702/start/1 (pid 7882)]   File "/tmp/tmplj4lal94/metaflow/datastore/datastore.py", line 48, in method
2020-08-11 17:09:04.665 [1597176542444702/start/1 (pid 7882)]     return f(self, args, kwargs)
2020-08-11 17:09:04.665 [1597176542444702/start/1 (pid 7882)]   File "/tmp/tmplj4lal94/metaflow/datastore/datastore.py", line 500, in persist
2020-08-11 17:09:04.665 [1597176542444702/start/1 (pid 7882)]     sha, size, encoding = self._save_object(obj, var, force_v4)
2020-08-11 17:09:04.665 [1597176542444702/start/1 (pid 7882)]   File "/tmp/tmplj4lal94/metaflow/datastore/datastore.py", line 429, in _save_object
2020-08-11 17:09:04.665 [1597176542444702/start/1 (pid 7882)]     transformable_obj.transform(lambda x: pickle.dumps(x, protocol=2))
2020-08-11 17:09:04.665 [1597176542444702/start/1 (pid 7882)]   File "/tmp/tmplj4lal94/metaflow/datastore/datastore.py", line 66, in transform
2020-08-11 17:09:04.665 [1597176542444702/start/1 (pid 7882)]     temp = transformer(self._object)
2020-08-11 17:09:04.665 [1597176542444702/start/1 (pid 7882)]   File "/tmp/tmplj4lal94/metaflow/datastore/datastore.py", line 429, in <lambda>
2020-08-11 17:09:04.665 [1597176542444702/start/1 (pid 7882)]     transformable_obj.transform(lambda x: pickle.dumps(x, protocol=2))
2020-08-11 17:09:04.665 [1597176542444702/start/1 (pid 7882)]   File "/opt/anaconda3/envs/metaflow_PlayListFlow_osx-64_2b8c423287a339b3e7c633a6dc58580161663113/lib/python3.7/site-packages/pandas/core/generic.py", line 1931, in __getstate__
2020-08-11 17:09:04.665 [1597176542444702/start/1 (pid 7882)]     return dict(_data=self._data, _typ=self._typ, _metadata=self._metadata,
2020-08-11 17:09:04.665 [1597176542444702/start/1 (pid 7882)]   File "/opt/anaconda3/envs/metaflow_PlayListFlow_osx-64_2b8c423287a339b3e7c633a6dc58580161663113/lib/python3.7/site-packages/pandas/core/generic.py", line 5063, in __getattr__
2020-08-11 17:09:04.665 [1597176542444702/start/1 (pid 7882)]     return object.__getattribute__(self, name)
2020-08-11 17:09:04.665 [1597176542444702/start/1 (pid 7882)] AttributeError: 'DataFrame' object has no attribute '_data'
2020-08-11 17:09:04.665 [1597176542444702/start/1 (pid 7882)] 
2020-08-11 17:09:04.665 [1597176542444702/start/1 (pid 7882)] Using analysis from 'Run('MovieStatsFlow/1597169402444620')'
2020-08-11 17:09:04.668 [1597176542444702/start/1 (pid 7882)] Task failed.
2020-08-11 17:09:04.668 Workflow failed.
2020-08-11 17:09:04.668 Terminating 0 active tasks...
2020-08-11 17:09:04.668 Flushing logs...
    Step failure:
    Step start (task-id 1) failed.
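
The last frames of the traceback show `__getstate__` reading `self._data` while the object is being pickled, and the attribute is missing. One way an object can end up in that state is by being restored from a pickle whose state was laid out differently from what the current class expects. The sketch below reproduces that failure mode with a toy class. It is an illustration only and not pandas' actual code; the `Frame` class and the `_values` attribute name are hypothetical.

```python
import pickle


class Frame:
    """Toy stand-in for a DataFrame-like class; not pandas' real code."""

    def __init__(self, values):
        self._data = values

    def __getstate__(self):
        # Same shape as the failing line in the traceback: assumes _data exists.
        return dict(_data=self._data)

    def __setstate__(self, state):
        # Restores whatever attributes the pickle carried, without checking them.
        self.__dict__.update(state)


# Simulate restoring state written by code that stored its values under a
# different (hypothetical) attribute name. __init__ never runs on this path.
restored = Frame.__new__(Frame)
restored.__setstate__({"_values": [1, 2, 3]})

try:
    pickle.dumps(restored, protocol=2)
    error = None
except AttributeError as exc:
    # Re-pickling calls __getstate__, which looks up the missing _data.
    error = str(exc)

print(error)
```

The object loads without complaint and only fails later, when something tries to pickle it again, which is consistent with the error surfacing in `persist` rather than at load time.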

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Reactions: 3
  • Comments: 6 (2 by maintainers)

Top GitHub Comments

romain-intel commented, Aug 25, 2020 (6 reactions)

We made a recent release with a comment to that effect. You are right that the environment used to execute Episode 4 is isolated. The issue is that Episode 4 reads an artifact produced in Episode 2, which used your system Pandas version (it must have been 1.0+). Pandas has breaking changes between 0.24.2 and 1.0+, so 0.24.2 (the version used in Episode 4) can’t read the artifact produced by 1.0+ (the version used in Episode 2). In a future release, we will update the Pandas version used in Episode 4 to be more recent and alleviate these issues. Apologies for not noting this in this issue at the time of release.
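
As a rough guard against this kind of mismatch, a flow could record the library version that produced an artifact and compare it with the version about to read it. The sketch below uses only the standard library. The helper names `major_version` and `same_major` are hypothetical, and comparing major versions is a heuristic assumed here, not a compatibility rule stated by Pandas or Metaflow.

```python
def major_version(version: str) -> int:
    """Return the leading integer of a version string such as '1.1.0'."""
    return int(version.split(".")[0])


def same_major(producer_version: str, consumer_version: str) -> bool:
    """Heuristic: treat pickled artifacts as portable only within one major version."""
    return major_version(producer_version) == major_version(consumer_version)


# Versions named in this thread: Episode 4 pins 0.24.2, Episode 2 ran on 1.0+.
print(same_major("1.0.0", "0.24.2"))
print(same_major("0.24.2", "0.24.2"))
```

In a real flow the producer side would store `pandas.__version__` next to the DataFrame, and the consumer would check it before using the artifact. Storing plain Python structures instead of a pickled DataFrame avoids the dependency altogether.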

vitorinter commented, Aug 11, 2020 (1 reaction)

Sure. I’ve just done it. The problem persists, though.

