Detect if adata.X is log normalised or not.
See original GitHub issue
Hi,
I wanted to know how the check ‘Did not modify X as it looks preprocessed already’ works. In the code it compares the counts of the spliced layer and X:
log_advised = np.allclose(adata.X[:10].sum(), adata.layers['spliced'][:10].sum())
Can you comment on why X would be log normalised if these counts are equal?
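A minimal sketch of how such a check could behave, assuming a hypothetical helper (log_transform_advised is not part of scvelo; the real logic lives in its preprocessing utilities): if the first few cells of X sum to the same total as the spliced layer, X presumably still holds the raw counts and a log transform would be advised; if the sums differ, X "looks preprocessed already" and is left untouched.

import numpy as np
from anndata import AnnData

def log_transform_advised(adata, n_cells=10):
    # Compare the total counts of the first n_cells rows of X against the
    # spliced layer; matching sums suggest X was never normalised or logged.
    x_sum = adata.X[:n_cells].sum()
    spliced_sum = adata.layers["spliced"][:n_cells].sum()
    return np.allclose(x_sum, spliced_sum)

# Toy example: X starts out identical to the spliced counts ...
counts = np.random.poisson(2.0, size=(50, 20)).astype(np.float32)
adata = AnnData(X=counts.copy(), layers={"spliced": counts.copy()})
print(log_transform_advised(adata))   # True -> X still looks like raw counts

# ... but after total-count normalisation and log1p the sums no longer match.
adata.X = np.log1p(adata.X / adata.X.sum(axis=1, keepdims=True) * 1e4)
print(log_transform_advised(adata))   # False -> "looks preprocessed already"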
Issue Analytics
- Created 5 years ago
- Comments: 5 (5 by maintainers)
Top GitHub Comments
This would be quite heuristic and would cover only a few cases. Since log transformation is a common preprocessing step, I’d assume that the user has already done so if they merge their pre-analysed data.
The most important cases are already covered in filter_and_normalize, see https://github.com/theislab/scvelo/blob/d64a55d455cfaab0e61c8c41b7528216263720cd/scvelo/preprocessing/utils.py#L410

A simple method can check whether the data was log scaled: take the range of counts, max(X) - min(X). If this range is greater than 100 (another assumption), it can be said that the counts have not been log transformed. I agree with your point that if the data was scaled to unit variance beforehand, this method would fail.
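A sketch of that range heuristic (the threshold of 100 is the commenter’s assumption, not a scvelo constant, and probably_log_transformed is a hypothetical name): raw UMI counts typically span a range well above 100, while log1p-transformed values rarely exceed single digits, so the spread of X can serve as a crude indicator. As noted above, it breaks down if the data was scaled to unit variance (or otherwise rescaled) beforehand.

import numpy as np

def probably_log_transformed(X, threshold=100):
    # Works for dense arrays and scipy.sparse matrices alike, since both
    # expose .max() and .min(); a small spread hints at log-transformed data.
    value_range = X.max() - X.min()
    return value_range <= threshold

# Illustrative counts with one highly expressed gene pushing the range up.
raw = np.random.poisson(5.0, size=(100, 200)).astype(float)
raw[:, 0] = 500.0
print(probably_log_transformed(raw))            # False: looks like raw counts
print(probably_log_transformed(np.log1p(raw)))  # True: range collapses after log1p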