[BUG-REPORT] Fresh install of vaex cannot open files written to disk

Description

In a fresh notebook without vaex installed, run the following:

!pip install vaex-core==4.15.0 vaex-hdf5==0.12.3

import vaex
import numpy as np

df = vaex.from_arrays(id=list(range(100_000)), emb=np.random.rand(100_000, 768))
df.export('file.hdf5')
df.export('file1.hdf5')

vaex.open("file*.hdf5")

Opening the exported files then fails with the following error:

[screenshot of the error; image not preserved]

Software information

  • Vaex version (import vaex; vaex.__version__): {'vaex-core': '4.15.0', 'vaex-hdf5': '0.12.3'}
  • Vaex was installed via: pip / conda-forge / from source
  • OS:

Issue Analytics

  • State: open
  • Created: 10 months ago
  • Comments: 13 (13 by maintainers)

Top GitHub Comments

Ben-Epstein commented, Dec 2, 2022 (1 reaction)

@JovanVeljanoski I think it's because of the legacy importlib, because it happens for arrow as well. I think @franz101's fix is the correct one: https://github.com/vaexio/vaex/pull/2293

franz101 commented, Dec 1, 2022 (1 reaction)

After reproducing the error, it seems to me to be related to pip install vaex vs pip install vaex-core, or some other packages.
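Since the comments point at which vaex packages are installed, it can help to list them before retrying the reproduction. The following is only a minimal sketch using the standard library's importlib.metadata; the helper name installed_vaex_packages is hypothetical and not part of vaex, and the sketch does not confirm the root cause discussed in the thread.

```python
from importlib import metadata


def installed_vaex_packages():
    """Return {distribution name: version} for installed distributions
    whose name starts with 'vaex' (hypothetical helper, not a vaex API)."""
    found = {}
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        # Some broken installs have no Name field; skip those.
        if name and name.lower().startswith("vaex"):
            found[name] = dist.version
    return found


if __name__ == "__main__":
    # Print one "name==version" line per vaex-related distribution.
    for name, version in sorted(installed_vaex_packages().items()):
        print(f"{name}=={version}")
```

Comparing this output between an environment set up with pip install vaex and one set up with pip install vaex-core vaex-hdf5 shows which sub-packages differ between the two.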
