highly_variable_genes - issue

Hi there,

While running sc.pp.highly_variable_genes(adata.X) I got the following error:

AttributeError: X not found

I then ran sc.pp.highly_variable_genes(adata) and got the following:

ValueError: Bin edges must be unique: array([nan, inf, inf, inf, inf, inf, inf, inf, inf, inf, inf, inf, inf, inf, inf, inf, inf, inf, inf, inf, inf]). You can drop duplicate edges by setting the 'duplicates' kwarg

The older sc.pp.filter_genes_dispersion(adata.X) works fine.

Do you know how to fix this?

Thank you!

Info: scanpy==1.3.4 anndata==0.6.13 numpy==1.15.3 scipy==1.1.0 pandas==0.23.4 scikit-learn==0.20.0 statsmodels==0.9.0 python-igraph==0.7.1 louvain==0.6.1

Issue Analytics

Created 5 years ago
Comments: 21 (7 by maintainers)

Top GitHub Comments

2 reactions

kanefos commented, Dec 27, 2020

I have an AnnData object whose .X matrix has been transformed by size-factor division, +1, and log. A subsequent call to sc.pp.highly_variable_genes(dataset, flavor='cell_ranger', n_top_genes=1000) raises the error discussed above: ValueError: Bin edges must be unique: ... You can drop duplicate edges by setting the 'duplicates' kwarg. Converting .X to a sparse matrix did not resolve the error, and neither did any of the other suggested solutions.

Edit: However, while I could not get flavor='cell_ranger' to work on the data I normalised myself, flavor='seurat' worked fine. I therefore recommend that anyone else hitting this error try the second flavour, because as I understand it the two use a similar methodology.
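One way duplicate bin edges can arise is sketched below. It assumes that flavor='cell_ranger' bins genes by percentiles of their mean expression, which is an assumption about scanpy internals that this thread does not confirm, and it may not be the cause in the case above. The gene means are hypothetical and only the Python standard library is used: when many genes share the same mean (for example, genes that are never detected), several percentile cut points collapse onto the same value.

```python
from statistics import quantiles


def percentile_edges(means, n=20):
    """Cut points at every 100/n percent of the per-gene means."""
    return quantiles(means, n=n, method="inclusive")


def has_duplicate_edges(edges):
    """True if at least two cut points share the same value."""
    return len(set(edges)) < len(edges)


# Hypothetical per-gene means: 50 genes never detected, 50 expressed genes.
gene_means = [0.0] * 50 + [i / 10 for i in range(1, 51)]

# With the all-zero genes included, the lowest cut points are all 0.0.
edges_all = percentile_edges(gene_means)

# After dropping the all-zero genes, every cut point is distinct.
edges_expressed = percentile_edges([m for m in gene_means if m > 0])

print(has_duplicate_edges(edges_all), has_duplicate_edges(edges_expressed))
```

In scanpy terms, the analogous step would be something like sc.pp.filter_genes(adata, min_cells=1) before selecting variable genes. Whether that helps depends on the data.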

1 reaction

Koncopd commented, Dec 6, 2018

Hi, could you please try

sc.pp.log1p(adata)
sc.pp.highly_variable_genes(adata)

This is because highly_variable_genes expects logarithmized data.
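A possible reason the log step matters, sketched with the standard library only: assume the default flavor reverses the log transform with expm1 before computing per-gene means (an assumption about scanpy internals, not stated in this thread). Applying expm1 to raw counts then overflows, which would explain the inf bin edges in the error message. The values below are hypothetical, and expm1_or_inf is an illustrative helper that mimics NumPy returning inf on overflow where Python's math module raises instead.

```python
import math


def expm1_or_inf(x):
    """Return exp(x) - 1, or inf on overflow (NumPy returns inf here instead of raising)."""
    try:
        return math.expm1(x)
    except OverflowError:
        return math.inf


def mean_after_undoing_log(values):
    """Per-gene mean computed after reversing an assumed log1p transform."""
    return sum(expm1_or_inf(v) for v in values) / len(values)


# Hypothetical expression values for a single gene across three cells.
raw_counts = [0.0, 3.0, 1200.0]
logged = [math.log1p(c) for c in raw_counts]

# Raw counts are not logarithmized, so "undoing" the log overflows.
mean_from_raw = mean_after_undoing_log(raw_counts)

# Logarithmized input round-trips back to the count-scale mean.
mean_from_logged = mean_after_undoing_log(logged)

print(mean_from_raw, mean_from_logged)
```

If this assumption holds, running sc.pp.log1p(adata) first keeps the per-gene means finite, which is consistent with the suggestion above.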
