Performance improvement: sfaira.data.store.io_data.read_dao


The sfaira.data.store.io_data.read_dao function spends most of its time reading in a pickle file: https://github.com/theislab/sfaira/blob/aeaa60ff128046b7564aa9cccd6293a9300e5a31/sfaira/data/store/io_dao.py#L117

Maybe there is a more efficient way to read in the necessary data.

Here’s a screenshot of the profiler output:

[Image: profile-load-store-dao]
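
For readers who want to reproduce this kind of measurement, here is a minimal sketch of profiling a pickle read with cProfile. It does not call sfaira; `load_pickle`, the file name `uns.pickle`, and the payload are hypothetical stand-ins for the read that happens inside read_dao.

```python
import cProfile
import io
import pickle
import pstats
import tempfile
from pathlib import Path


def load_pickle(path):
    """Hypothetical stand-in for the pickle read inside read_dao."""
    with open(path, "rb") as f:
        return pickle.load(f)


with tempfile.TemporaryDirectory() as tmp:
    # Assumed file name and payload, for illustration only.
    path = Path(tmp) / "uns.pickle"
    path.write_bytes(pickle.dumps({"id": "demo", "n_obs": 3}))

    # Profile only the load call.
    profiler = cProfile.Profile()
    profiler.enable()
    loaded = load_pickle(path)
    profiler.disable()

# Collect the report as text, sorted by cumulative time.
buffer = io.StringIO()
pstats.Stats(profiler, stream=buffer).sort_stats("cumulative").print_stats()
report = buffer.getvalue()
print(report)
```

Sorting by cumulative time puts the outermost call at the top, which makes it easier to see whether the time is really spent in `pickle.load` or somewhere around it.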

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 8 (4 by maintainers)

Top GitHub Comments

1 reaction
felix0097 commented, Nov 26, 2021

Just checked: the profiler output does not really make much sense, because the uns data files are super small and so should be fast to read in.

I ran the pyinstrument profiler instead of cProfile, and things look much more sensible now: [image]

Looking at this, it doesn’t make much sense to change the current implementation away from using pickle.
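
The reasoning in this comment (small files should load quickly) can be cross-checked without any profiler by measuring the file size and the wall-clock load time directly. The following is only a sketch: `size_and_load_time`, the file name, and the payload are hypothetical and not part of sfaira.

```python
import pickle
import tempfile
import time
from pathlib import Path


def size_and_load_time(path):
    """Return (size in bytes, load time in seconds, loaded object)."""
    size = path.stat().st_size
    start = time.perf_counter()
    with open(path, "rb") as f:
        obj = pickle.load(f)
    elapsed = time.perf_counter() - start
    return size, elapsed, obj


with tempfile.TemporaryDirectory() as tmp:
    # Assumed file name and payload, for illustration only.
    path = Path(tmp) / "uns.pickle"
    path.write_bytes(pickle.dumps({"id": "demo"}))
    size_bytes, seconds, obj = size_and_load_time(path)

print(f"{size_bytes} bytes loaded in {seconds:.6f} s")
```

If a file of a few kilobytes shows up as the dominant cost in a profile, a direct timing like this helps decide whether the profiler output or the file itself is the problem.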

0 reactions
felix0097 commented, Dec 20, 2021

This is now fixed by the above-mentioned pull request. The issue was that the AnnData OverloadedDict saved a lot of unnecessary data, which made the saved pickle files extremely large.
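
The general effect described here can be illustrated with a small sketch. It does not use AnnData: `WrappedDict` is a hypothetical stand-in for a dict wrapper, and the assumption that the extra data comes from a reference to a larger parent object is made for illustration only. The actual fix in the pull request may differ.

```python
import pickle


class WrappedDict(dict):
    """Hypothetical dict wrapper that also keeps a reference to a parent object."""

    def __init__(self, data, parent):
        super().__init__(data)
        # Instance attributes are pickled together with the dict items.
        self.parent = parent


# Assumed stand-in for a large object the wrapper happens to reference.
big_parent = list(range(200_000))
uns = WrappedDict({"organism": "human"}, parent=big_parent)

# Pickling the wrapper drags the parent along.
wrapped_size = len(pickle.dumps(uns))
# Converting to a plain dict first keeps only the actual entries.
plain_size = len(pickle.dumps(dict(uns)))

print(f"wrapped: {wrapped_size} bytes, plain dict: {plain_size} bytes")
```

Converting to a plain `dict` before pickling is one way to avoid storing such extra state; whether it applies to a given wrapper depends on what that wrapper actually holds.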


Top Results From Across the Web

Data stores — sfaira v0.3.11+5.g8ebd63b documentation
The DAO store format is an on-disk representation of single-cell data which is optimised for generator-based access and distributed access. In brief, DAO...
Sfaira accelerates data and model reuse in single cell genomics
Sfaira is a data and model zoo that automates common steps in exploratory single-cell RNA-seq analysis. a Overview workflow of sfaira data ......
