--error-correct-cell behaviour

Hi,

Thanks for the tool. I have a problem with --error-correct-cell, and I don’t know if I misunderstood something.

So first I do a whitelist:

umi_tools whitelist --extract-method=regex --bc-pattern='(?P<discard_1>TTCG){s<=1}(?P<cell_1>.{15,17})(?P<discard_2>TGCTTACGCTACGGAACGA){s<=3}(?P<umi_1>.{9})' --stdin=input.fastq -S BBC.txt

Which gives me, in particular, this line in the whitelist output file:

CTGTTGATCACCCGTA CTGTTGATCACCCGTAT

And then I do an extract:

umi_tools extract --extract-method=regex --bc-pattern='(?P<discard_1>TTCG){s<=1}(?P<cell_1>.{15,17})(?P<discard_2>TGCTTACGCTACGGAACGA){s<=3}(?P<umi_1>.{9})' --whitelist BBC.txt --error-correct-cell --stdin=input.fastq -S input.BBC.fastq

And in the output FASTQ file, I have this read header:

@M05218:191:000000000-D7R5H:1:1102:14341:19429_CTGTTGATCACCCGTAT_CCTCAAACG 1:N:0:2

where I was expecting the BC to be corrected to “CTGTTGATCACCCGTA”, but it actually has the uncorrected form (the one that was actually found in the read). Is this the expected behaviour? Did I misunderstand something?
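To make the expectation in the question concrete, here is a minimal Python sketch of the correction the poster was hoping for. It is not UMI-tools code. It assumes the whitelist columns are tab-separated, with the accepted barcode first and a comma-separated list of error variants second, and that the read name has the form `<read id>_<cell barcode>_<UMI>`. The function names `load_whitelist` and `correct_header` are hypothetical.

```python
def load_whitelist(lines):
    """Map every barcode (accepted or listed variant) to its accepted form.

    Assumed layout: tab-separated columns, accepted barcode first,
    comma-separated error variants second. Extra columns are ignored.
    """
    correction = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if not fields[0]:
            continue  # skip blank lines
        accepted = fields[0]
        correction[accepted] = accepted
        if len(fields) > 1 and fields[1]:
            for variant in fields[1].split(","):
                correction[variant] = accepted
    return correction


def correct_header(header, correction):
    """Rewrite the cell barcode in '<id>_<cell>_<umi> <comment>'.

    Returns None when the barcode is neither accepted nor a known variant.
    """
    name, _, comment = header.partition(" ")
    read_id, cell, umi = name.rsplit("_", 2)
    accepted = correction.get(cell)
    if accepted is None:
        return None
    fixed = f"{read_id}_{accepted}_{umi}"
    return f"{fixed} {comment}" if comment else fixed


# The whitelist line and read header quoted in the question.
whitelist = ["CTGTTGATCACCCGTA\tCTGTTGATCACCCGTAT\n"]
header = ("@M05218:191:000000000-D7R5H:1:1102:14341:19429"
          "_CTGTTGATCACCCGTAT_CCTCAAACG 1:N:0:2")
corrected = correct_header(header, load_whitelist(whitelist))
print(corrected)
```

Under these assumptions the 17-base barcode found in the read would be replaced by the 16-base accepted barcode, which is the outcome the poster expected and did not observe.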

Cheers, Mathieu

Issue Analytics

  • State: open
  • Created 3 years ago
  • Comments: 5 (3 by maintainers)

Top GitHub Comments

2 reactions
TomSmithCGAT commented, Apr 30, 2020

There is indeed a redundancy between --whitelist and --filter-cell-barcode, where the former is a path to a whitelist file and the latter a switch to filter against it. We can leave this issue open as a reminder to remove the redundant option (I suggest having --whitelist switch on --filter-cell-barcode and hiding the --filter-cell-barcode option so as not to break any users' current pipelines). --error-correct-cell is a separate option, however, since one might wish to retain only cells that perfectly match the whitelist.
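The distinction drawn in this comment can be modelled with a short Python sketch. It only illustrates the semantics as described (filter against the whitelist, with correction as a separate opt-in step) and is not taken from the UMI-tools source. The function `process_barcode` and its parameters are hypothetical names.

```python
def process_barcode(cell, accepted, corrections, error_correct=False):
    """Return the barcode to keep for a read, or None to discard the read.

    accepted:      set of barcodes on the whitelist
    corrections:   dict mapping known error variants to accepted barcodes
    error_correct: when False, only exact whitelist matches are retained
    """
    if cell in accepted:
        return cell  # perfect match: always retained
    if error_correct and cell in corrections:
        return corrections[cell]  # variant rewritten to its accepted barcode
    return None  # not on the whitelist, or correction is switched off


accepted = {"CTGTTGATCACCCGTA"}
corrections = {"CTGTTGATCACCCGTAT": "CTGTTGATCACCCGTA"}

# Filtering only: the 17-base variant is not an exact match, so it is dropped.
filtered_only = process_barcode("CTGTTGATCACCCGTAT", accepted, corrections)

# Filtering plus correction: the variant is mapped to the accepted barcode.
with_correction = process_barcode(
    "CTGTTGATCACCCGTAT", accepted, corrections, error_correct=True
)
print(filtered_only, with_correction)
```

In this model, supplying a whitelist implies filtering, which is why the filter switch looks redundant, while correction stays optional for users who want perfect matches only.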

0 reactions
mbahin commented, Apr 30, 2020

Actually, it looks like there is redundancy among the 3 options “--whitelist”, “--error-correct-cell” and “--filter-cell-barcode”, no?


