Avoid Kedro fsspec requirements being mutually incompatible with pandas 1.1.0

Description

Kedro 0.16.4 enforces fsspec<0.7.0,>=0.5.1 and PANDAS = "pandas>=0.24, <1.0.4" in setup.py.

However:

  1. The pandas upper version limit is not enforced, so newer pandas versions can coexist with Kedro after pip install kedro.
  2. The blocking issue on parquet reads in pandas (https://github.com/pandas-dev/pandas/issues/34467), which kept Kedro from allowing higher pandas versions, is resolved in pandas >1.0.5, so the key blocker on Kedro upgrading pandas versions no longer exists.

If Kedro allows pandas 1.1.0, you will hit an incompatibility with fsspec, because pandas 1.1.0 requires fsspec>=0.7.4.
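To make the conflict concrete, here is a minimal sketch that checks a few sample fsspec version strings against both constraints quoted in this issue. The helper names (`parse`, `kedro_allows`, `pandas_110_allows`) and the candidate list are made up for illustration, and the parser only handles plain numeric versions such as 0.7.4; it is not how pip resolves requirements.

```python
from typing import Tuple


def parse(version: str) -> Tuple[int, ...]:
    """Turn '0.7.4' into (0, 7, 4). Plain numeric releases only (assumption)."""
    return tuple(int(part) for part in version.split("."))


def kedro_allows(fsspec_version: str) -> bool:
    # Pin quoted in the issue for Kedro 0.16.4: fsspec<0.7.0,>=0.5.1
    return parse("0.5.1") <= parse(fsspec_version) < parse("0.7.0")


def pandas_110_allows(fsspec_version: str) -> bool:
    # Minimum quoted in the issue for pandas 1.1.0: fsspec>=0.7.4
    return parse(fsspec_version) >= parse("0.7.4")


# Sample version strings chosen for illustration.
candidates = ["0.5.1", "0.6.3", "0.7.0", "0.7.4", "0.8.0"]
both = [v for v in candidates if kedro_allows(v) and pandas_110_allows(v)]
print(both)  # [] : no candidate satisfies both constraints
```

Because the Kedro range ends below 0.7.0 and the pandas range starts at 0.7.4, the two ranges cannot overlap for any version, not just for the samples above.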

Context

You cannot run Kedro and pandas>=1.1.0 in the same environment, because pandas needs fsspec>=0.7.4.

There are meaningful improvements in newer pandas, so I would like to be able to run them together out-of-the-box.

The reason given in Kedro's setup.py for not allowing higher pandas versions (https://github.com/pandas-dev/pandas/issues/34467) no longer applies, so I think it is time to make this change, unless fsspec has deal-breaking problems in later versions.

Possible Implementation

Can you just relax the fsspec upper bound so that fsspec>=0.7.4 is allowed (the current requirement is fsspec<0.7.0,>=0.5.1)? I’m not aware of any major problems in the newer versions.

Possible Alternatives

Raise warnings or errors if Kedro coexists with pandas>=1.1.0.
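A rough sketch of what such a warning could look like, using only the standard library. The function names (`release_tuple`, `warn_if_pandas_too_new`) and the message text are hypothetical and are not part of Kedro; the 1.1.0 threshold is the one named in this issue.

```python
import warnings
from typing import Tuple


def release_tuple(version: str) -> Tuple[int, int, int]:
    """Return the leading numeric (major, minor, patch) of a version string.

    Suffixes such as 'rc0' are ignored and missing parts count as 0.
    """
    numbers = []
    for part in version.split(".")[:3]:
        digits = ""
        for ch in part:
            if not ch.isdigit():
                break
            digits += ch
        numbers.append(int(digits) if digits else 0)
    while len(numbers) < 3:
        numbers.append(0)
    return (numbers[0], numbers[1], numbers[2])


def warn_if_pandas_too_new(pandas_version: str) -> bool:
    """Emit a UserWarning when pandas is 1.1.0 or later; return True if warned."""
    if release_tuple(pandas_version) >= (1, 1, 0):
        warnings.warn(
            f"pandas {pandas_version} needs fsspec>=0.7.4, which conflicts "
            "with the fsspec<0.7.0 pin described in this issue.",
            UserWarning,
        )
        return True
    return False
```

In practice the version string could come from `importlib.metadata.version("pandas")` (Python 3.8+); it is passed in as an argument here so the check can be exercised without pandas installed.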

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Reactions: 3
  • Comments: 17 (9 by maintainers)

Top GitHub Comments

5 reactions
tamsanh commented, Aug 31, 2020

Oops. FYI, I’m running into an incompatibility with s3fs 0.5.0 as it requires fsspec>=0.8.0.

2 reactions
JMBurley commented, Sep 17, 2020

Thanks both,

Would I be correct in understanding the official position here as: “all file loads should be handled by .yml in your catalog folder; it is an anti-pattern to directly read files with pandas, therefore we don’t view it as critical for Kedro to facilitate pandas s3 functionality, nor maintain it if it is incidentally present in some Kedro versions”?

I understand the position if so, although I think there are good reasons to prototype EDA in a project using pandas s3 functionality rather than hopping in and out of the data catalog to create potentially disposable items.

Regardless, looking forward to v0.17.0!

PS. I agree that anyone who doesn’t use virtual environments deserves the problems they encounter.