pandas/io/feather_format.py should call use_threads instead of nthreads to prevent breakage in pyarrow 0.11.0

Code Sample

import pandas

d = {'one': [1., 2., 3., 4.],
     'two': [4., 3., 2., 1.]}
df = pandas.DataFrame(d)
df.to_feather('example.feather')

# with pyarrow 0.10.0 this succeeds with a deprecation warning
# with pyarrow 0.11.0 this errors with a TypeError: unexpected argument 'nthreads'
df = pandas.read_feather('example.feather')

# attempt to manually set nthreads results in TypeError: unexpected argument 'nthreads'
df = pandas.read_feather('example.feather', nthreads=4)

# attempt to pass 'use_threads' results in TypeError: unexpected argument 'nthreads'
df = pandas.read_feather('example.feather', use_threads=True)

Problem description

Pandas introduced the nthreads argument for reading feather files in issue 16359.

With PyArrow 0.10.0 a deprecation warning is shown from this source: “nthreads argument is deprecated, pass use_threads instead”

With PyArrow 0.11.0, Python raises: TypeError: read_feather() got an unexpected keyword argument ‘nthreads’.

I’ve searched with ‘pyarrow’ and ‘nthreads’ keywords and didn’t see this issue posted.

Specifically, line 112 of feather_format.py should either be changed to return feather.read_dataframe(path, use_threads=True), or the method signature should be changed to allow overriding use_threads: return feather.read_dataframe(path, use_threads=use_threads). I will submit a PR if the only barrier to a fix is coding effort.
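As a rough illustration of the second option, here is a minimal sketch of choosing the keyword by PyArrow version. The helper name feather_thread_kwargs is hypothetical and is not part of pandas or pyarrow. The 0.11 cutoff follows the behavior reported above (nthreads still works with a warning on 0.10.0 and fails on 0.11.0). Mapping a thread count to a boolean with "more than one thread" is an assumption made for this example.

```python
def feather_thread_kwargs(pyarrow_version, nthreads=1):
    """Pick the threading keyword for a given pyarrow version string.

    Hypothetical helper for illustration only; not a pandas or pyarrow API.
    """
    # Compare only (major, minor), e.g. "0.11.0" -> (0, 11).
    major_minor = tuple(int(part) for part in pyarrow_version.split(".")[:2])

    if major_minor < (0, 11):
        # Per the report above, 'nthreads' is still accepted before 0.11
        # (with a deprecation warning on 0.10.0).
        return {"nthreads": nthreads}

    # From 0.11.0 on, 'nthreads' is rejected and 'use_threads' is expected.
    # Assumption: any count above 1 means "use threads".
    return {"use_threads": nthreads > 1}


print(feather_thread_kwargs("0.10.0", nthreads=4))
print(feather_thread_kwargs("0.11.0", nthreads=4))
```

A caller could then unpack the result into the read call from the report, for example feather.read_dataframe(path, **feather_thread_kwargs(pyarrow.__version__, nthreads)), so that one code path works on both sides of the 0.11.0 change.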

Expected Output

I expect no error output upon running pandas.read_feather() with PyArrow 0.11.0

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.6.6.final.0
python-bits: 64
OS: Windows
OS-release: 7
machine: AMD64
processor: Intel64 Family 6 Model 79 Stepping 1, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None

pandas: 0.23.4
pytest: None
pip: 18.1
setuptools: 40.3.0
Cython: None
numpy: 1.15.1
scipy: 1.1.0
pyarrow: 0.10.0
xarray: None
IPython: 6.5.0
sphinx: None
patsy: 0.5.0
dateutil: 2.7.3
pytz: 2018.5
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: 0.4.0
matplotlib: 2.2.2
openpyxl: None
xlrd: 1.1.0
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
sqlalchemy: 1.2.11
pymysql: None
psycopg2: None
jinja2: None
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Reactions: 2
  • Comments: 11 (6 by maintainers)

Top GitHub Comments

15 reactions
bartolsthoorn commented, Oct 16, 2018

Work-around might be useful to some people:

import feather
frame = feather.read_dataframe('filename.feather')
2 reactions
TomAugspurger commented, Dec 4, 2018

Next version of pandas. Aiming to have it out by the end of the year.

On Tue, Dec 4, 2018 at 12:21 PM Richard Anderson notifications@github.com wrote:

#23112 https://github.com/pandas-dev/pandas/pull/23112 So the fix will show up in the next release of pandas/pyarrow?
