[BUG-REPORT] Fresh install of vaex cannot open files written to disk
Description
In a fresh notebook without vaex installed, run the following:
!pip install vaex-core==4.15.0 vaex-hdf5==0.12.3
import vaex
import numpy as np
df = vaex.from_arrays(id=list(range(100_000)), emb=np.random.rand(100_000, 768))
df.export('file.hdf5')
df.export('file1.hdf5')
vaex.open("file*.hdf5")
You see the following

Software information
- Vaex version (import vaex; vaex.__version__): {'vaex-core': '4.15.0', 'vaex-hdf5': '0.12.3'}
- Vaex was installed via: pip / conda-forge / from source
- OS:
Issue Analytics
- State:
- Created 10 months ago
- Comments:13 (13 by maintainers)
@JovanVeljanoski I think it's because of the legacy importlib, because it happens for arrow as well. I think @franz101's fix is the correct one: https://github.com/vaexio/vaex/pull/2293
Having reproduced the error, it seems to me to be related to
pip install vaex
vs
pip install vaex-core
or some other packages.
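To compare the two install routes mentioned above, it may help to list which vaex distributions ended up in each environment and which entry points they register. The following is only a diagnostic sketch using the standard-library importlib.metadata; the helper names are hypothetical, and the assumption that the failure is tied to entry-point registration is a guess based on the importlib remark in the comment above, not something confirmed in this thread.

```python
from importlib import metadata


def installed_vaex_packages():
    """Return {distribution name: version} for installed distributions named vaex*."""
    found = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name", "") or ""
        if name.lower().startswith("vaex"):
            found[name] = dist.version
    return found


def vaex_entry_point_groups():
    """Return {entry point group: [entry point names]} for groups starting with 'vaex'.

    Iterating per distribution avoids the entry_points() call, whose
    return type differs between Python versions.
    """
    groups = {}
    for dist in metadata.distributions():
        for ep in dist.entry_points:
            if ep.group.startswith("vaex"):
                groups.setdefault(ep.group, []).append(ep.name)
    return groups


# Run this once in each environment and compare the two outputs.
print(installed_vaex_packages())
print(vaex_entry_point_groups())
```

Running this after pip install vaex and again after pip install vaex-core vaex-hdf5 (in separate fresh environments) shows whether the two setups differ in installed packages or registered entry points.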