
Results interpretation - overestimation?

See original GitHub issue

I would like some help: I'm running CellBender on a "problematic" sample (a trial of a new nuclei-isolation protocol that resulted in lower quality). We know that many cells need to be discarded from this sample (we achieved this with stringent filtering on nUMI and %mt). I decided to try CellBender to see whether I could get a better estimate of which cells are worth keeping for downstream analysis, but CellBender seems to keep all of them. Looking at the Cell Ranger output, I'm not sure which droplets are actually cells and which are ambient/low-quality cells. I used the following parameters:

cellbender remove-background \
    --input ./raw_S2R2/raw_feature_bc_matrix.h5 \
    --output ./raw_S2R2/cellbender_matrix.h5 \
    --expected-cells 800 \
    --total-droplets-included 6000 \
    --fpr 0.01 \
    --epochs 150

and got the attached PDF output (s2r2.pdf).
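For context on how the cell calls in that PDF arise (not part of the original post): CellBender reports a posterior cell probability for each analyzed droplet in its output, and droplets above a probability threshold are called cells. A minimal sketch, using a hypothetical array of probabilities rather than a real CellBender output file:

```python
import numpy as np

# Hypothetical posterior cell probabilities for 10 droplets.
# In a real run these come from CellBender's output .h5
# (the exact field name varies by CellBender version).
cell_prob = np.array([0.99, 0.97, 0.92, 0.85, 0.60,
                      0.45, 0.10, 0.05, 0.01, 0.00])

# Droplets with posterior probability above 0.5 are called cells.
n_cells = int(np.sum(cell_prob > 0.5))
print(n_cells)  # 5
```

If many low-quality droplets receive high probabilities, CellBender will call them cells even though downstream QC filtering (nUMI, %mt) would discard them, which is consistent with the behavior described above.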

Now my questions are:

  • What does this output mean? Is this sample not suitable for CellBender, or did I just use the wrong parameters?

  • Since Cell Ranger and CellBender keep a comparable number of cells, can the CellBender output be considered cleaner because it should also have removed some background from the count matrix, or are the CellBender results not trustworthy?

I hope this is clear, and thanks very much in advance!

Issue Analytics

  • State: open
  • Created: a year ago
  • Comments: 12 (5 by maintainers)

Top GitHub Comments

sjfleming commented, Oct 31, 2022

Huh, I did not expect that… well, thanks for letting me know! I should keep that in mind in the future.

giorgiatosoni commented, Nov 29, 2022

Oh I see… thanks for the explanation!

Read more comments on GitHub >

