High level of background: misassigned droplets

Dear coders of CellBender,

First of all, thank you for providing such a useful tool! I found out that it can easily be run on a free GPU-powered Google Colab instance, which is nice for those who do not have access to a GPU!

While the tool works flawlessly for several of my datasets, some particular 10X runs coming from the same lab show an issue: in the log PDF, a lot of droplets from the empty-droplet plateau are assigned to cells, whereas I am rather inclined to believe that they are empty.

One particularity of these datasets is that they all share a very high amount of background (in the following example, the plateau is around 2000 UMIs!): [image]

The log at the start of the run is the following:

cellbender remove-background --input drive/My Drive/ML10_raw_feature_bc_matrix.h5 --output ML10_150_output.h5 --cuda --expected-cells 21000 --total-droplets-included 50000 --epochs 150
cellbender:remove-background: 2020-02-12 12:44:41
cellbender:remove-background: Running remove-background
cellbender:remove-background: Loading data from file drive/My Drive/ML10_raw_feature_bc_matrix.h5
cellbender:remove-background: CellRanger v3 format
cellbender:remove-background: Trimming dataset for inference.
cellbender:remove-background: Prior on counts in empty droplets is 1807
cellbender:remove-background: Prior on counts for cells is 14313
cellbender:remove-background: Excluding barcodes with counts below 1445
cellbender:remove-background: Using 21000 probable cell barcodes, plus an additional 29000 barcodes, and 28217 empty droplets.
cellbender:remove-background: Running inference...
...

The prior on counts in empty droplets seems reasonable to me, or should I choose a higher one?
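One rough way to sanity-check that prior against the raw data is to rank barcodes by total UMI count and take the median over a window of ranks that should fall on the plateau. The sketch below is an illustration only: it is not how CellBender computes its priors, the function name `plateau_estimate` is hypothetical, and the toy counts are made up.

```python
from statistics import median


def plateau_estimate(umi_counts, first_rank, last_rank):
    """Median UMI count of barcodes ranked first_rank..last_rank-1.

    Ranks are 0-based and taken after sorting counts in descending order.
    Hypothetical helper, not part of CellBender.
    """
    ranked = sorted(umi_counts, reverse=True)
    window = ranked[first_rank:last_rank]
    if not window:
        raise ValueError("rank window is empty")
    return median(window)


# Made-up toy data: 5 cells, 20 plateau droplets, 10 near-empty barcodes.
counts = (
    [14000 + 100 * i for i in range(5)]
    + [1750 + 5 * i for i in range(20)]
    + [3] * 10
)

# Skip the 5 highest-count barcodes and look at the next 20 ranks.
print(plateau_estimate(counts, first_rank=5, last_rank=25))
```

If the value obtained this way on the real matrix is close to the prior reported in the log, the prior is probably not the source of the problem.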

The output log PDF is as follows: ML10_output

Following the documentation, I decided to run the analysis by increasing the number of z-dims, z-layers and epochs with the following command:

cellbender remove-background --input "drive/My Drive/ML10_raw_feature_bc_matrix.h5" --output ML10_highbckgrd_output.h5 --cuda --expected-cells 21000 --total-droplets-included 50000 --epochs 300 --z-dim 200 --z-layers 1000

But this did not improve anything, and training actually shows weird behaviour, probably because these parameter values are too high: ML10_high_output

Am I missing something? Is there a parameter that could influence the misassignment?

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 15 (6 by maintainers)

Top GitHub Comments

1 reaction
sjfleming commented, Feb 19, 2020

It’s not clear if the v2 branch will help with this issue (yet), although once v2 is done, it will be a significant improvement in a number of ways.

We hope that cell calling is one of those improvements… but v2 is not complete yet. I think you may see some improvement in the current state of v2, but I am working on a few more ways to address this.

For now, what you can count on is: remove-background v1 does not leave cells out. All cells will be called cells. But it will pick up some empty droplets. This is worse in some datasets than others. Currently, the best practice is to filter those out based on other QC metrics downstream.
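The downstream filtering mentioned above could look roughly like the following. This is a minimal sketch and not part of CellBender: the function name `filter_called_cells`, the record keys (`n_genes`, `mito_fraction`) and the threshold values are all illustrative assumptions, and suitable QC metrics and cutoffs depend on the dataset.

```python
def filter_called_cells(cells, min_genes=200, max_mito_fraction=0.2):
    """Keep called barcodes that pass simple QC thresholds.

    cells: list of dicts with hypothetical keys
           'barcode', 'n_genes' and 'mito_fraction'.
    The default thresholds are placeholders, not recommendations.
    """
    return [
        c
        for c in cells
        if c["n_genes"] >= min_genes and c["mito_fraction"] <= max_mito_fraction
    ]


# Made-up barcodes that were all called as cells.
called = [
    {"barcode": "AAAC", "n_genes": 2500, "mito_fraction": 0.04},
    {"barcode": "CCGT", "n_genes": 120, "mito_fraction": 0.02},  # too few genes
    {"barcode": "GGTA", "n_genes": 900, "mito_fraction": 0.35},  # high mito share
]

kept = filter_called_cells(called)
print([c["barcode"] for c in kept])
```

Because remove-background v1 is described as calling all real cells but also some empty droplets, a filter of this kind only needs to discard barcodes; it does not need to recover any.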

1 reaction
deevdevil88 commented, Feb 18, 2020

@sjfleming yes, I saw that and we are using it now. We also have a collaborator who is letting me run the samples on their GPU. I have been using it and the samples run fine 😃

Thanks, Devika


