highly_variable_genes - issue

Hi there,

While running sc.pp.highly_variable_genes(adata.X) I got the following error:

AttributeError: X not found

I then ran sc.pp.highly_variable_genes(adata) and got the following:

ValueError: Bin edges must be unique: array([nan, inf, inf, inf, inf, inf, inf, inf, inf, inf, inf, inf, inf,inf, inf, inf, inf, inf, inf, inf, inf]). You can drop duplicate edges by setting the duplicates kwarg

The older sc.pp.filter_genes_dispersion(adata.X) works fine.

Do you know how to fix this?

Thank you!

Info: scanpy==1.3.4 anndata==0.6.13 numpy==1.15.3 scipy==1.1.0 pandas==0.23.4 scikit-learn==0.20.0 statsmodels==0.9.0 python-igraph==0.7.1 louvain==0.6.1

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Comments: 21 (7 by maintainers)

Top GitHub Comments

2 reactions
kanefos commented, Dec 27, 2020

I have an AnnData object whose .X matrix has been normalised by size-factor division, followed by +1 and a log transform. A subsequent call to sc.pp.highly_variable_genes(dataset, flavor='cell_ranger', n_top_genes=1000) yields the error discussed above: ValueError: Bin edges must be unique: ... You can drop duplicate edges by setting the 'duplicates' kwarg. Converting .X to a sparse matrix did not resolve the error, and neither did any of the other suggested solutions.

Edit: However! While I could not get flavor='cell_ranger' to work on the data I normalised myself, flavor='seurat' worked fine. I therefore recommend that anyone else encountering this error try this second flavour since, as I understand it, the two use a similar methodology.

1 reaction
Koncopd commented, Dec 6, 2018

Hi, could you please try

sc.pp.log1p(adata)
sc.pp.highly_variable_genes(adata)

As highly_variable_genes expects logarithmized data.
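
A rough illustration of why unlogged input could produce the inf bin edges in the error above. This is only a sketch: it assumes the default flavor undoes a presumed log1p with an exponential (expm1) before averaging each gene, which is not confirmed in this thread. The helper name expm1_means and the toy matrix are hypothetical, and only NumPy is used.

```python
import numpy as np


def expm1_means(X):
    """Per-gene mean after undoing a presumed log1p transform.

    Hypothetical helper: it mimics the assumed exp-then-average step,
    not the actual scanpy implementation.
    """
    # Silence the overflow warning so the inf result can be inspected.
    with np.errstate(over="ignore"):
        return np.expm1(X).mean(axis=0)


# Toy matrix: 3 cells x 3 genes of raw (unlogged) counts.
raw_counts = np.array([
    [0.0, 5.0, 1200.0],
    [3.0, 0.0, 900.0],
    [1.0, 2.0, 1500.0],
])

# Raw counts: exp(900) and above overflow float64, so the last mean is inf.
means_raw = expm1_means(raw_counts)

# Logarithmized counts: expm1 inverts log1p, so the means stay finite.
means_logged = expm1_means(np.log1p(raw_counts))

print(means_raw)
print(means_logged)
```

If the means come out as inf, any binning of genes by mean expression has no distinct edges to work with, which would be consistent with the "Bin edges must be unique" message. Checking np.isfinite on the per-gene means before calling highly_variable_genes is a cheap way to spot this.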
