Detect if adata.X is log normalised or not.

Hi

I wanted to know how the check ‘Did not modify X as it looks preprocessed already’ works. In the code, it compares the counts of spliced and X:

log_advised = np.allclose(adata.X[:10].sum(), adata.layers['spliced'][:10].sum())

Can you comment on why X would be log normalised if these counts are equal?
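
A minimal sketch of the comparison quoted above, using plain NumPy arrays in place of an AnnData object. The function name looks_unprocessed and the toy data are illustrative assumptions, not part of scVelo. Going by the variable name log_advised, one plausible reading is that equal sums mean X still holds the raw spliced counts, so a log transform is advised, while differing sums suggest X was already preprocessed.

```python
import numpy as np


def looks_unprocessed(X, spliced, n_rows=10):
    """Hypothetical helper: True when the first n_rows of X and of the
    spliced layer sum to (almost) the same total, i.e. X is likely raw."""
    x_total = np.asarray(X)[:n_rows].sum()
    spliced_total = np.asarray(spliced)[:n_rows].sum()
    return bool(np.allclose(x_total, spliced_total))


# Toy data standing in for adata.layers['spliced'] (cells x genes).
rng = np.random.default_rng(0)
spliced = rng.poisson(5.0, size=(50, 20)).astype(float)

raw_X = spliced.copy()        # X untouched: identical to the spliced counts
logged_X = np.log1p(spliced)  # X already log-transformed

raw_result = looks_unprocessed(raw_X, spliced)
logged_result = looks_unprocessed(logged_X, spliced)
print(raw_result, logged_result)
```

Only the first ten rows are summed, so the check is cheap but also coarse: it detects any change to X, not a log transform specifically.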

Issue Analytics

  • State: closed
  • Created: 5 years ago
  • Comments: 5 (5 by maintainers)

Top GitHub Comments

VolkerBergen commented, Jun 5, 2019

Such a check would be quite heuristic and would cover only a few cases. Since log transformation is a common preprocessing step, I’d assume the user has already applied it if they merge in pre-analysed data.

The most important cases are already covered in filter_and_normalize, see https://github.com/theislab/scvelo/blob/d64a55d455cfaab0e61c8c41b7528216263720cd/scvelo/preprocessing/utils.py#L410

saksham219 commented, Jun 5, 2019

A simple method can check whether the data was log scaled: take the range of the counts, max(X) - min(X). If this range is greater than 100 (another assumption), the counts have probably not been log transformed. I agree with your point that this method would fail if the data had been scaled to unit variance beforehand.
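
The range heuristic described in this comment could be sketched as follows. The function name range_suggests_raw_counts is hypothetical, the threshold of 100 is the commenter's own assumption, and dense NumPy arrays are used for simplicity.

```python
import numpy as np


def range_suggests_raw_counts(X, threshold=100.0):
    """Hypothetical helper: True when max(X) - min(X) exceeds the threshold,
    which this heuristic takes as a sign that X was not log transformed."""
    X = np.asarray(X, dtype=float)
    return bool(X.max() - X.min() > threshold)


# Toy count matrix (cells x genes) with a wide dynamic range.
counts = np.array([[0.0, 3.0, 250.0],
                   [1.0, 0.0, 1200.0]])

raw_flag = range_suggests_raw_counts(counts)               # range is 1200
logged_flag = range_suggests_raw_counts(np.log1p(counts))  # range is about 7.1

# Failure case from the comment: data scaled to unit variance per gene
# has a small range even though no log transform was applied.
scaled = (counts - counts.mean(axis=0)) / counts.std(axis=0)
scaled_flag = range_suggests_raw_counts(scaled)

print(raw_flag, logged_flag, scaled_flag)
```

The scaled case returns the same answer as the log-transformed case, which is the limitation acknowledged above; shallow raw data whose largest count is below the threshold would be misclassified in the same way.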
