numpy.load cannot read from tar archive in python3 ('_FileInFile' object has no attribute 'fileno')

I would like to use numpy.load to read an ndarray from a file inside a tar archive, without extracting it to disk, in Python 3.

import os
import numpy
import tarfile

# create a test tar archive
a = numpy.random.rand(10, 4)
numpy.save('a.npy', a)
with open('foo.txt', 'w') as f:
    f.write('hey\nho\n')
t = tarfile.open('abc.tar', 'w')
t.add('a.npy')
t.add('foo.txt')
t.close()
del t, a
os.remove('a.npy')
os.remove('foo.txt')

# -----

t = tarfile.open('abc.tar', 'r')
a = numpy.load(t.extractfile('a.npy'))

The above code snippet works fine in Python 2.7.12 with numpy 1.11.1, but fails in Python 3.5.2 with numpy 1.11.1 (Debian Linux):

Traceback (most recent call last):
  File "loadtest.py", line 21, in <module>
    a = numpy.load(t.extractfile('a.npy'))
  File "[...]/lib/python3.5/site-packages/numpy/lib/npyio.py", line 406, in load
    pickle_kwargs=pickle_kwargs)
  File "[...]/lib/python3.5/site-packages/numpy/lib/format.py", line 648, in read_array
    array = numpy.fromfile(fp, dtype=dtype, count=count)
AttributeError: '_FileInFile' object has no attribute 'fileno'

Am I doing something wrong? The numpy docs say that numpy.load accepts a file-like object that supports the seek() and read() methods, and the io.BufferedReader returned by TarFile.extractfile() supports both.

Issue Analytics

  • State: open
  • Created: 7 years ago
  • Reactions: 5
  • Comments: 8

Top GitHub Comments

dmitriy-serdyuk commented, Oct 31, 2017 (3 reactions)

A hack to solve this is

from io import BytesIO

# Copy the archive member into an in-memory buffer that numpy.load can read
array_file = BytesIO()
array_file.write(t.extractfile('a.npy').read())
array_file.seek(0)
numpy.load(array_file)

dmitriy-serdyuk commented, Dec 10, 2018 (0 reactions)

@enricozb I haven’t tested this carefully, but it should load only one npy file from your tar. Make sure that you remove old BytesIO objects.
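
The BytesIO workaround from this thread could be wrapped in a small helper along the following lines. This is only a sketch: load_npy_from_tar is a hypothetical name, not part of numpy or tarfile, and it assumes the member is a plain .npy file small enough to hold in memory. It reads a single member per call and keeps the buffer local, so no old BytesIO objects are left around.

```python
import io
import tarfile

import numpy


def load_npy_from_tar(tar_path, member_name):
    """Load one .npy member from a tar archive without extracting it to disk.

    Hypothetical helper for illustration; not part of numpy or tarfile.
    """
    with tarfile.open(tar_path, 'r') as archive:
        member = archive.extractfile(member_name)
        if member is None:
            # extractfile() returns None for members that are not regular files
            raise ValueError('%r is not a regular file in %r' % (member_name, tar_path))
        with member:
            # Copy the member's bytes into memory; BytesIO supports seek() and read()
            buffer = io.BytesIO(member.read())
    # The buffer is a local variable, so it is released once the function returns
    return numpy.load(buffer)
```

With the test archive from the question, the call would be a = load_npy_from_tar('abc.tar', 'a.npy'). The whole member is copied into memory first, so this trades memory for compatibility.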
