Add generic buffer (s3) support to read_hdf
See original GitHub issue

Hey folks, this is not a bug but a question/feature request.
Today read_hdf does not support reading HDF files directly from S3. If you try to pass an S3 URL directly, as you can do with read_csv, you get a "file does not exist" error message:
>>> import pandas as pd
>>> df = pd.read_hdf('s3://mybucket/myfile.h5', 'df')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/pandas/io/pytables.py", line 395, in read_hdf
raise FileNotFoundError(f"File {path_or_buf} does not exist")
FileNotFoundError: File s3://mybucket/myfile.h5 does not exist
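For comparison, read_csv does fetch the object through s3fs when given an S3 URL. A minimal sketch, where the .csv key is a hypothetical object in the same bucket:

>>> import pandas as pd
>>> # works when s3fs is installed; pandas opens the S3 object for you
>>> df = pd.read_csv('s3://mybucket/myfile.csv')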
And if we try to pass a file-like object as the path, we get an error saying that support for generic buffers is not implemented:
>>> import pandas as pd
>>> from s3fs import S3FileSystem
>>> s3 = S3FileSystem(anon=False)
>>> df = pd.read_hdf(s3.open('mybucket/myfile.h5', mode='rb'), 'df')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/pandas/io/pytables.py", line 385, in read_hdf
"Support for generic buffers has not been implemented."
NotImplementedError: Support for generic buffers has not been implemented.
Are there any plans to implement generic buffers for read_hdf, so that we could read from S3 directly?
I took a quick look and this seems to be a restriction in PyTables (the tables package), which does not accept S3 URLs or file-like objects, but I'm not sure.
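In the meantime, the usual workaround is to copy the object to a local temporary file and point read_hdf at that path. A minimal sketch using s3fs, reusing the bucket and key from the example above:

import tempfile
import pandas as pd
from s3fs import S3FileSystem

s3 = S3FileSystem(anon=False)
# download the HDF5 object to a local temporary file first,
# then let read_hdf open it from the local path
with tempfile.NamedTemporaryFile(suffix='.h5') as tmp:
    s3.get('mybucket/myfile.h5', tmp.name)
    df = pd.read_hdf(tmp.name, 'df')

(On Windows the temporary file may need to be closed before read_hdf can reopen it, so writing to a plain named path may be simpler there.)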
Issue Analytics
- Created 4 years ago
- Reactions: 4
- Comments: 7 (4 by maintainers)
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
this is not likely, as HDF5 doesn't have much support for this
What's the current status of this issue? I faced a similar problem and was wondering if there is an update on this, or if the workaround is still to download the files to the local file system first. In the mentioned issue someone wanted to work on this, but apparently nothing happened. Any feedback is greatly appreciated 😃
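If writing a temporary file is undesirable, another approach that is sometimes suggested is to read the object's bytes and hand them to PyTables' in-memory (H5FD_CORE) driver through pd.HDFStore. This is a sketch under the assumption that HDFStore forwards extra keyword arguments to tables.open_file; the bucket, key and 'df' key name follow the example above:

import pandas as pd
from s3fs import S3FileSystem

s3 = S3FileSystem(anon=False)
# read the whole HDF5 object into memory
with s3.open('mybucket/myfile.h5', mode='rb') as f:
    image = f.read()

# open the in-memory image with the HDF5 core driver; the file name
# 'in-memory.h5' is only a label and nothing is read from or written to disk
with pd.HDFStore('in-memory.h5', mode='r',
                 driver='H5FD_CORE',
                 driver_core_image=image,
                 driver_core_backing_store=0) as store:
    df = store['df']

This only helps when the file fits in memory, since the entire object is downloaded before HDF5 sees it.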