no highly variable genes
I ran the following with the newest version of Scanpy:
sc.pp.highly_variable_genes(adata, min_mean=0.0125, max_mean=3, min_disp=0.05, batch_key='batch')
It did add columns such as highly_variable_nbatches, but every gene was labelled as not highly variable (False).
Any ideas?
Issue Analytics
- State:
- Created 4 years ago
- Comments:8 (4 by maintainers)
I just checked again, and it's not exactly the same. If you set n_top_genes, you get the top genes shared by the most batches. If you instead set thresholds for mean and dispersion, those thresholds are applied to the mean and the dispersion averaged across all batches. These averages can fall below the thresholds when HVGs are not shared between many batches. So to be safe, you can go with n_top_genes.
OK. So if I understand correctly, when batch_key is used, adata.var['highly_variable'] is just adata.var['highly_variable_intersection']?
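The difference between the two selection modes described in the reply above can be sketched in plain Python. This is only a simplified illustration of the averaging behaviour as described in this thread, not Scanpy's actual implementation: the gene names, the per-batch dispersion values, the 0.5 cutoff, and the helper functions (summarize, select_by_threshold, select_top_n) are all hypothetical, and the mean-expression thresholds are left out for brevity.

```python
from statistics import mean

# Hypothetical normalized dispersions for three genes in four batches.
per_batch_disp = {
    "gene_a": [1.2, 0.1, 0.1, 0.1],  # above the cutoff in one batch only
    "gene_b": [0.1, 0.9, 0.8, 0.1],  # above the cutoff in two batches
    "gene_c": [0.4, 0.4, 0.4, 0.4],  # below the cutoff in every batch
}
MIN_DISP = 0.5  # assumed dispersion cutoff, chosen only for this example


def summarize(disps, min_disp):
    """Count batches passing the cutoff and average the dispersion."""
    return {
        "nbatches": sum(d > min_disp for d in disps),
        "mean_disp": mean(disps),
    }


summary = {gene: summarize(d, MIN_DISP) for gene, d in per_batch_disp.items()}


def select_by_threshold(summary, min_disp):
    """Threshold mode: compare the batch-averaged dispersion to the cutoff."""
    return {gene: s["mean_disp"] > min_disp for gene, s in summary.items()}


def select_top_n(summary, n):
    """Top-n mode: rank by batches passed, then by averaged dispersion."""
    ranked = sorted(
        summary,
        key=lambda g: (summary[g]["nbatches"], summary[g]["mean_disp"]),
        reverse=True,
    )
    top = set(ranked[:n])
    return {gene: gene in top for gene in summary}


print(select_by_threshold(summary, MIN_DISP))
print(select_top_n(summary, 2))
```

With these made-up numbers, the threshold mode flags no gene at all, because each batch-averaged dispersion falls below 0.5 even though gene_a and gene_b pass the cutoff in individual batches. The top-n mode still returns gene_b and gene_a. This mirrors the symptom reported in the issue and the reason n_top_genes was suggested as the safer choice.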