Doublet filtering function


Hi,

I tried the DoubletDetection Python library on my data and got some interesting results. Since it can be run directly on a NumPy array holding the count matrix (adata.X), I thought it would be an interesting feature for scanpy.

import doubletdetection
import pandas as pd

clf = doubletdetection.BoostClassifier()
doublets = clf.fit(adata.X).predict()  # adata.X: raw count matrix
adata.obs['doublet'] = pd.Categorical(doublets.astype(bool))

Issue Analytics

  • State: open
  • Created 5 years ago
  • Comments: 16 (10 by maintainers)

Top GitHub Comments

falexwolf commented, Jun 22, 2018 (1 reaction)

Hi @JonathanShor,

you don’t need to create a custom API. One point of Scanpy is to provide convenient access via anndata to many single-cell packages around. The only thing needed for that is to provide a very simple interface like this or this or several of the other tools… Simply click on the GitHub links in the Scanpy docs…
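
To make the "very simple interface" idea concrete, here is a minimal sketch of what such a thin wrapper might look like. The function name detect_doublets, the key_added parameter, and the toy ThresholdClassifier are all hypothetical and are not part of Scanpy or DoubletDetection. The only assumptions are that the classifier exposes fit(...).predict(), as in the snippet from the original post, and that it returns 0/1 labels. A SimpleNamespace stands in for an AnnData object so the example runs without any third-party packages.

```python
from types import SimpleNamespace


def detect_doublets(adata, classifier, key_added="doublet"):
    """Hypothetical wrapper: run `classifier` on adata.X and store
    one boolean label per cell in adata.obs[key_added]."""
    labels = classifier.fit(adata.X).predict()
    adata.obs[key_added] = [bool(label) for label in labels]
    return adata


class ThresholdClassifier:
    """Toy stand-in for a real doublet classifier (an assumption for
    illustration only): flags cells whose total count is above a cutoff."""

    def __init__(self, max_total):
        self.max_total = max_total
        self._totals = None

    def fit(self, counts):
        # Sum the counts of each cell (one row per cell).
        self._totals = [sum(row) for row in counts]
        return self

    def predict(self):
        # 1 = putative doublet, 0 = singlet.
        return [1 if total > self.max_total else 0 for total in self._totals]


# Stand-in for an AnnData object: three cells, three genes.
adata = SimpleNamespace(X=[[1, 0, 2], [9, 8, 7], [0, 3, 1]], obs={})
detect_doublets(adata, ThresholdClassifier(max_total=10))
print(adata.obs["doublet"])  # [False, True, False]
```

With real data one would presumably pass an AnnData object and something like doubletdetection.BoostClassifier() instead of the stand-ins; taking the classifier as an argument keeps the wrapper independent of any single detection method.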

If your package works reliably, neither of the restrictions you mention should in principle prevent adding your package. Of course, in the future, we want all elements of Scanpy to scale to millions of cells, not just the core tools. But for a lot of people, it is helpful right now to have a large number of tools available, even if only for relatively small datasets.

The only problem is to avoid cluttering the Scanpy API with virtually any tool there is. Tools in the API should have passed a certain quality check.

Doublet detection is a difficult problem. Already last autumn, we played around with @swolock’s tool but didn’t end up using it: it was good, but it didn’t seem to apply in our situation (are you eventually going to distribute a package for it, @swolock?). I quickly wrote a tool myself, too, but it didn’t work well. Just yesterday, this appeared. Then there is also this on “empty cell detection”. There are more tools out there, I think…

What I mean is: computational doublet detection is still a problem on which the field has not reached a consensus, just like batch correction. Therefore, I would not add a tool tl.doublet_detection or tl.detect_doublets to the API at this stage.

There are two options. Either we create a .beta module of the API for tools that don’t even have a preprint yet, and add your tool and similar future cases there; we could make a separate page for it, entitled Cutting Edge Beta Tools, which advertises these tools for people to try out and play around with. Or, when you have a solid preprint and/or publication, or if you think that your tool should go in the main API anyway 😄, we add your package as tl.detect_doublets_ONEWORDDESCRIBINGYOURALGORITHM.

@flying-sheep @gokceneraslan @fidelram @dawe any opinions on such cases?

pinin4fjords commented, Oct 23, 2019 (0 reactions)

Thanks @fidelram, that will run the whole Scrublet workflow so will certainly do the trick. But I’d prefer a more Scanpy-integrated approach, which I think I can see how to do from @swolock’s fork.


