Detect if adata.X is log normalised or not.

I wanted to know how the check behind the message ‘Did not modify X as it looks preprocessed already’ works. In the code, it compares the summed counts of the first 10 rows of X with those of the spliced layer: log_advised = np.allclose(adata.X[:10].sum(), adata.layers['spliced'][:10].sum())

Can you comment on why X would be log normalised if these counts are equal?
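
For illustration, here is a minimal sketch of the comparison quoted above, using plain NumPy arrays in place of an AnnData object. The helper name looks_unprocessed and the toy data are assumptions made for this example and are not part of scvelo. The idea is that if X still sums to the same value as the spliced layer, X has most likely not been transformed yet, so log-transforming it is advised; if the sums differ, X was presumably already modified.

```python
import numpy as np

def looks_unprocessed(X, spliced, n_rows=10):
    """Hypothetical helper: True if the first n_rows of X sum to the same
    value as the first n_rows of the spliced counts, i.e. X still looks raw."""
    return bool(np.allclose(X[:n_rows].sum(), spliced[:n_rows].sum()))

# Toy stand-ins for adata.layers['spliced'] and adata.X (assumed data).
rng = np.random.default_rng(0)
spliced = rng.poisson(5.0, size=(20, 8)).astype(float)

raw_like = looks_unprocessed(spliced.copy(), spliced)        # X is still the raw counts
logged_like = looks_unprocessed(np.log1p(spliced), spliced)  # X was log1p-transformed
```

Note that the check only shows whether X differs from the spliced counts. It cannot tell which transformation was applied, which is why the message says X "looks preprocessed" rather than "is log normalised".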

Issue Analytics

Created 5 years ago
Comments:5 (5 by maintainers)

Top GitHub Comments

VolkerBergen commented, Jun 5, 2019

That would be quite heuristic and would cover only a few cases. Since log transformation is a common preprocessing step, I’d assume the user has already applied it if they merge in their pre-analysed data.

The most important cases are already covered in filter_and_normalize, see https://github.com/theislab/scvelo/blob/d64a55d455cfaab0e61c8c41b7528216263720cd/scvelo/preprocessing/utils.py#L410

saksham219 commented, Jun 5, 2019

A simple method can check whether the data was log scaled: take the range of the counts, max(X) - min(X). If this range is greater than 100 (another assumption), we can say that the counts have not been log transformed. I agree with your point that this method would fail if the data had been scaled to unit variance beforehand.
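
A rough sketch of the range heuristic described in this comment might look as follows. The function name looks_log_transformed, the default threshold of 100, and the toy matrix are all assumptions for illustration; dense NumPy input is assumed, so a sparse matrix would need converting first.

```python
import numpy as np

def looks_log_transformed(X, max_range=100.0):
    """Hypothetical heuristic: treat X as log transformed when
    max(X) - min(X) does not exceed max_range (an assumed threshold)."""
    X = np.asarray(X, dtype=float)
    return bool(X.max() - X.min() <= max_range)

# Toy count matrix (assumed data): 2 cells x 3 genes.
counts = np.array([[0, 3, 250],
                   [12, 0, 1800]], dtype=float)

on_raw = looks_log_transformed(counts)            # range is 1800
on_logged = looks_log_transformed(np.log1p(counts))  # range is about 7.5

# Caveat from the comment: raw counts scaled to unit variance per gene
# also have a small range, so the heuristic misreads them as log transformed.
scaled = (counts - counts.mean(axis=0)) / counts.std(axis=0)
on_scaled = looks_log_transformed(scaled)
```

The last case shows why the maintainer calls such a check heuristic: a small range is consistent with log transformation but does not prove it.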
