
High level of background: misassigned droplets

See original GitHub issue

Dear coders of CellBender,

First of all, thank you for providing such a useful tool! I found out that it can easily be run on a free GPU-powered Google Colab instance, which is nice for those who do not have access to a GPU!

While the tool works flawlessly for several of my datasets, some 10X runs coming from the same lab show an issue: in the log PDF, many droplets from the empty-droplet plateau are assigned to cells, whereas I believe they should be empty.

One particularity of these datasets is that they all share a very high amount of background (for the following example, the plateau is around 2000 UMIs!):

The log at the start of the run is the following:

cellbender remove-background --input drive/My Drive/ML10_raw_feature_bc_matrix.h5 --output ML10_150_output.h5 --cuda --expected-cells 21000 --total-droplets-included 50000 --epochs 150
cellbender:remove-background: 2020-02-12 12:44:41
cellbender:remove-background: Running remove-background
cellbender:remove-background: Loading data from file drive/My Drive/ML10_raw_feature_bc_matrix.h5
cellbender:remove-background: CellRanger v3 format
cellbender:remove-background: Trimming dataset for inference.
cellbender:remove-background: Prior on counts in empty droplets is 1807
cellbender:remove-background: Prior on counts for cells is 14313
cellbender:remove-background: Excluding barcodes with counts below 1445
cellbender:remove-background: Using 21000 probable cell barcodes, plus an additional 29000 barcodes, and 28217 empty droplets.
cellbender:remove-background: Running inference...
...

The prior on counts in empty droplets seems reasonable to me, or should I choose a higher one?

The output log PDF is as follows: ML10_output
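One way to sanity-check the empty-droplet prior reported in the log is to rank barcodes by total UMI count and take the median over a rank window that is expected to sit on the plateau. The sketch below is only an illustrative heuristic and is not CellBender's own estimator; the function name `plateau_median`, the parameters `start_rank` and `end_rank`, and the synthetic counts are all assumptions made for the example.

```python
import numpy as np

def plateau_median(umi_counts, start_rank, end_rank):
    """Median UMI count of barcodes whose rank falls in [start_rank, end_rank).

    Barcodes are ranked by total UMI count in descending order. The rank
    window is chosen by the user (an assumption of this sketch) so that it
    covers the suspected empty-droplet plateau.
    """
    ranked = np.sort(np.asarray(umi_counts))[::-1]  # highest counts first
    window = ranked[start_rank:end_rank]
    if window.size == 0:
        raise ValueError("no barcodes in the requested rank window")
    return float(np.median(window))

# Synthetic data (hypothetical): 5 cell-like barcodes and 10 plateau barcodes.
counts = np.array([14000, 15000, 13500, 14200, 14800,
                   1800, 1750, 1900, 1820, 1780,
                   1810, 1790, 1830, 1770, 1805])
print(plateau_median(counts, start_rank=5, end_rank=15))
```

If a value computed this way on the real data is close to the logged prior (1807 here), the prior itself is probably not the source of the misassignment.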

Following the documentation, I decided to run the analysis by increasing the number of z-dims, z-layers and epochs with the following command:

cellbender remove-background --input "drive/My Drive/ML10_raw_feature_bc_matrix.h5" --output ML10_highbckgrd_output.h5 --cuda --expected-cells 21000 --total-droplets-included 50000 --epochs 300 --z-dim 200 --z-layers 1000

But this did not improve anything, and training actually shows odd behaviour, probably because the parameter values are too high: ML10_high_output

Am I missing something, such as a parameter that could influence the misassignment?

Issue Analytics

Created 4 years ago
Comments: 15 (6 by maintainers)

Top GitHub Comments

sjfleming commented, Feb 19, 2020

It’s not clear if the v2 branch will help with this issue (yet), although once v2 is done, it will be a significant improvement in a number of ways.

We hope that cell calling is one of those improvements… but v2 is not complete yet. I think you may see some improvement in the current state of v2, but I am working on a few more ways to address this.

For now, what you can count on is: remove-background v1 does not leave cells out. All cells will be called cells. But it will pick up some empty droplets. This is worse in some datasets than others. Currently, the best practice is to filter those out based on other QC metrics downstream.
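As a rough illustration of filtering on downstream QC metrics, the sketch below drops barcodes that have few detected genes or a high mitochondrial fraction. It is not part of CellBender; the function name `qc_keep_mask`, the choice of metrics, and the threshold values are assumptions for the example, and suitable cutoffs depend on the dataset. A small dense matrix is used for clarity, whereas real count matrices are usually sparse.

```python
import numpy as np

def qc_keep_mask(counts, is_mito, min_genes=200, max_mito_frac=0.2):
    """Boolean mask of barcodes that pass two simple QC filters.

    counts  : (barcodes x genes) array of UMI counts
    is_mito : boolean array over genes marking mitochondrial genes
    The default thresholds are placeholders, not recommendations.
    """
    counts = np.asarray(counts)
    is_mito = np.asarray(is_mito, dtype=bool)

    genes_detected = (counts > 0).sum(axis=1)   # genes with at least one UMI
    total = counts.sum(axis=1)                  # total UMIs per barcode
    mito = counts[:, is_mito].sum(axis=1)       # mitochondrial UMIs per barcode

    # Guard against division by zero for barcodes with no counts at all.
    mito_frac = np.divide(mito, total,
                          out=np.zeros(len(total), dtype=float),
                          where=total > 0)

    return (genes_detected >= min_genes) & (mito_frac <= max_mito_frac)

# Toy example (hypothetical values): 3 barcodes, 4 genes, last gene mitochondrial.
toy_counts = np.array([[5, 3, 2, 1],
                       [0, 0, 0, 4],
                       [1, 1, 0, 8]])
toy_is_mito = np.array([False, False, False, True])
keep = qc_keep_mask(toy_counts, toy_is_mito, min_genes=2, max_mito_frac=0.5)
print(keep)
```

In the toy example only the first barcode passes: the second has a single detected gene and the third has a mitochondrial fraction of 0.8.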

deevdevil88 commented, Feb 18, 2020

@sjfleming Yes, I saw that and we are using it now. We also have a collaborator who is letting me run the samples on their GPU. I have been using that and the samples run fine 😃

Thanks, Devika
