Doublet filtering function
Hi,
I tried the DoubletDetection Python library on my data and got some interesting results. Since it can be run directly on a NumPy array holding the count matrix (adata.X), I thought it would be an interesting feature for scanpy.
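import doubletdetection
import pandas as pd
# adata is an existing AnnData object; fit on its count matrix and predict one label per cell
clf = doubletdetection.BoostClassifier()
doublets = clf.fit(adata.X).predict()
adata.obs['doublet'] = pd.Categorical(doublets.astype(bool))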
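Once per-cell labels exist, the filtering step itself is a boolean mask over the rows of the count matrix. The sketch below uses toy data and a hypothetical helper named singlet_mask; neither comes from DoubletDetection or Scanpy. It assumes labels of 1 for doublets, 0 for singlets, and possibly NaN for cells without a call. Whether a given classifier version emits NaN is an assumption to verify on your own output. It matters because casting NaN to bool yields True, so a plain astype(bool) would silently mark such cells as doublets.

```python
import numpy as np


def singlet_mask(labels):
    """Hypothetical helper: True only for cells explicitly labelled 0 (singlet)."""
    labels = np.asarray(labels, dtype=float)
    # NaN == 0 evaluates to False, so cells without a call are dropped
    # together with the predicted doublets (a conservative design choice).
    return labels == 0


# Toy data: 5 cells x 3 genes with made-up labels
# (1 = doublet, 0 = singlet, NaN = no call).
counts = np.arange(15).reshape(5, 3)
labels = np.array([0.0, 1.0, 0.0, np.nan, 1.0])

mask = singlet_mask(labels)
filtered = counts[mask]  # keeps only rows 0 and 2
```

With an AnnData object the same mask could be applied as adata[mask].copy(), since AnnData supports boolean indexing along the observation axis.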
Issue Analytics
- State:
- Created 5 years ago
- Comments: 16 (10 by maintainers)
Hi @JonathanShor,
you don’t need to create a custom API. One point of Scanpy is to provide convenient access via anndata to many of the single-cell packages around. The only thing needed for that is to provide a very simple interface like this or this or several of the other tools… Simply click on the GitHub links in the Scanpy docs… If your package works reliably, neither of the restrictions you mention should in principle prevent adding your package. Of course, in the future we want all elements of Scanpy to scale to millions of cells, not just the core tools. But for a lot of people, it is helpful right now to have a large number of tools available, even if only for relatively small datasets.
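As an illustration of what such a thin interface might look like, here is a minimal sketch. The function name detect_doublets, the key_added parameter, and the ThresholdClassifier stand-in are all hypothetical and are not part of the Scanpy API or of any doublet detection package. The only assumption about the classifier is the fit(...).predict() pattern shown in the issue. A SimpleNamespace stands in for an AnnData object so the example runs without extra dependencies.

```python
from types import SimpleNamespace


def detect_doublets(adata, classifier, key_added="doublet"):
    """Hypothetical wrapper: run `classifier` on adata.X, store labels in adata.obs."""
    labels = classifier.fit(adata.X).predict()
    # Store one boolean per cell under the chosen key.
    adata.obs[key_added] = [bool(x) for x in labels]
    return adata


class ThresholdClassifier:
    """Stand-in only, NOT a real doublet detector: flags cells with high total counts."""

    def __init__(self, max_total):
        self.max_total = max_total

    def fit(self, X):
        # Total counts per cell (row).
        self.totals_ = [sum(row) for row in X]
        return self

    def predict(self):
        return [int(total > self.max_total) for total in self.totals_]


# Toy stand-in for an AnnData object: 3 cells x 2 genes, empty annotations.
adata = SimpleNamespace(X=[[1, 2], [30, 40], [3, 1]], obs={})
detect_doublets(adata, ThresholdClassifier(max_total=10))
```

The wrapper never inspects the classifier beyond calling fit and predict, so any package following that pattern could be plugged in without a custom API.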
The only problem is to avoid cluttering the Scanpy API with virtually any tool there is. Tools in the API should have passed a certain quality check.
Doublet detection is a difficult problem. Last autumn we already played around with @swolock’s tool but didn’t end up using it: it was good, but it didn’t seem to apply in our situation (are you eventually going to distribute a package for it, @swolock?). I quickly wrote a tool myself, too, but it didn’t work well. Just yesterday, this appeared. Then there is also this on “empty cell detection”. There are more tools out there, I think…
What I mean is: computationally detecting doublets is still a problem on which the field has not reached a consensus, just like batch correction. Therefore, I would not add a tool tl.doublet_detection or tl.detect_doublets to the API at this stage.
There are two options. Either we create a .beta module of the API for tools that don’t even have a preprint, and add your tool and similar future cases there. We could make a separate page for that, entitled Cutting Edge Beta Tools, which advertises these tools for people to try out and play around with. Or, once you have a solid preprint and/or publication, or if you think that your tool should go in the main API anyway 😄, we should add your package as tl.detect_doublets_ONEWORDDESCRIBGINGYOURALGORITHM
… @flying-sheep @gokceneraslan @fidelram @dawe, any opinions on such cases?
Thanks @fidelram, that will run the whole Scrublet workflow so will certainly do the trick. But I’d prefer a more Scanpy-integrated approach, which I think I can see how to do from @swolock’s fork.