sc.pp.highly_variable_genes: overflow encountered
Env:
- Ubuntu 16.04
- python 3.7
- pandas 0.25.0
- scanpy 1.4.4.post1
I have an AnnData object called adata. The maximum value in the count matrix adata.X is 3701.
When I do
sc.pp.highly_variable_genes(adata)
I get the following error
/home/sfleming/anaconda3/envs/scvi/lib/python3.7/site-packages/scipy/sparse/data.py:132: RuntimeWarning: overflow encountered in expm1
  result = op(self._deduped_data())
/home/sfleming/anaconda3/envs/scvi/lib/python3.7/site-packages/scipy/sparse/data.py:132: RuntimeWarning: invalid value encountered in expm1
  result = op(self._deduped_data())
/home/sfleming/anaconda3/envs/scvi/lib/python3.7/site-packages/scanpy/preprocessing/_utils.py:18: RuntimeWarning: overflow encountered in square
  var = (mean_sq - mean**2) * (X.shape[0]/(X.shape[0]-1))
/home/sfleming/anaconda3/envs/scvi/lib/python3.7/site-packages/scanpy/preprocessing/_utils.py:18: RuntimeWarning: invalid value encountered in subtract
  var = (mean_sq - mean**2) * (X.shape[0]/(X.shape[0]-1))
/home/sfleming/anaconda3/envs/scvi/lib/python3.7/site-packages/scanpy/preprocessing/_highly_variable_genes.py:86: RuntimeWarning: overflow encountered in log1p
  mean = np.log1p(mean)
/home/sfleming/anaconda3/envs/scvi/lib/python3.7/site-packages/scanpy/preprocessing/_highly_variable_genes.py:86: RuntimeWarning: invalid value encountered in log1p
  mean = np.log1p(mean)
Traceback (most recent call last):
  File "../../scvi/scvi_adata.py", line 75, in <module>
    sc.pp.highly_variable_genes(adata)
  File "/home/sfleming/anaconda3/envs/scvi/lib/python3.7/site-packages/scanpy/preprocessing/_highly_variable_genes.py", line 257, in highly_variable_genes
    flavor=flavor)
  File "/home/sfleming/anaconda3/envs/scvi/lib/python3.7/site-packages/scanpy/preprocessing/_highly_variable_genes.py", line 92, in _highly_variable_genes_single_batch
    df['mean_bin'] = pd.cut(df['means'], bins=n_bins)
  File "/home/sfleming/anaconda3/envs/scvi/lib/python3.7/site-packages/pandas/core/reshape/tile.py", line 233, in cut
    "cannot specify integer `bins` when input data " "contains infinity"
ValueError: cannot specify integer `bins` when input data contains infinity
Indeed, if I do np.expm1(3701) I get an overflow.
I think it will be necessary to come up with a way to calculate highly variable genes without doing expm1 on the raw counts, due to this overflow issue.
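For reference, the overflow can be reproduced with NumPy alone. The sketch below is only an illustration: the variable names are made up for this example, and the only value taken from the report is the maximum count of 3701. It computes the largest input for which the exponential still fits in float32 and float64, then applies expm1 to the reported maximum.

```python
import numpy as np

raw_max = 3701.0  # maximum raw count reported above

# Largest x for which exp(x) is still representable in each float type.
limit_f32 = float(np.log(np.finfo(np.float32).max))  # roughly 88.7
limit_f64 = float(np.log(np.finfo(np.float64).max))  # roughly 709.8

# Silence the RuntimeWarning so the inf result can be inspected directly.
with np.errstate(over="ignore"):
    result_f32 = np.expm1(np.float32(raw_max))
    result_f64 = np.expm1(np.float64(raw_max))

print(f"float32 limit: {limit_f32:.2f}, expm1(3701) = {result_f32}")
print(f"float64 limit: {limit_f64:.2f}, expm1(3701) = {result_f64}")
```

Both results are inf, since 3701 is far above either limit. That infinity then propagates into the mean and variance, which is consistent with the pd.cut error in the traceback.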
Sorry, just realizing that this function expects logarithmized data. My fault.
Still, the error message could be a lot better. I’ve made the same mistake; it’s easy to forget to log the data.
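One way a clearer message could look is sketched below. This is not scanpy's actual behaviour or API: check_logarithmized is a hypothetical helper written only for this illustration, and it assumes a dense NumPy array rather than a sparse matrix. It back-transforms the data with expm1 and raises a readable error when the result is not finite.

```python
import numpy as np


def check_logarithmized(X, name="adata.X"):
    """Hypothetical guard: fail early with a readable message if expm1(X) overflows."""
    X = np.asarray(X, dtype=np.float64)
    with np.errstate(over="ignore", invalid="ignore"):
        back_transformed = np.expm1(X)
    if not np.all(np.isfinite(back_transformed)):
        raise ValueError(
            f"expm1({name}) is not finite (max value {np.nanmax(X):g}). "
            "This function expects logarithmized data; "
            "did you forget to log-transform the counts?"
        )


# Toy count matrix containing the maximum value from the report.
counts = np.array([[0, 3, 3701], [1, 0, 12]])

try:
    check_logarithmized(counts)
    raw_rejected = False
except ValueError as err:
    raw_rejected = True
    print(err)

# Log-transformed data passes the check without raising.
check_logarithmized(np.log1p(counts))
```

In scanpy itself the log transform is usually applied with sc.pp.log1p(adata) before calling sc.pp.highly_variable_genes(adata).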
On Fri 2 Aug 2019 at 23:36, Stephen Fleming notifications@github.com wrote: