How to create duplex UMIs with AnnotateBamWithUmis


I am working with data that uses two UMIs for paired-end reads. One UMI was included as part of index 1 and the other as part of index 2. I’d like to annotate the RX field in my BAM file with both UMIs, separated by a dash, as in NNNNNNNN-NNNNNNNN. I see that CorrectUmis can handle duplex UMIs, in that it looks for the consensus sequence independently for each half. What would be the best way to annotate a BAM with duplex UMIs in this case? With AnnotateBamWithUmis, running it a second time seems to replace the contents of the RX field written by the first run.
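To make the target format concrete, here is a minimal Python sketch of the pairing step only: it reads the UMI sequences from the index 1 and index 2 FASTQ records, matches them by read name, and builds the dash-joined string intended for RX. The helper names (`parse_fastq`, `build_rx_tags`) are hypothetical and are not part of fgbio; the sketch assumes plain four-line FASTQ records and identical read names in both files.

```python
from typing import Dict, Iterable


def parse_fastq(lines: Iterable[str]) -> Dict[str, str]:
    """Map read name -> sequence, assuming strict 4-line FASTQ records."""
    cleaned = [ln.rstrip("\n") for ln in lines if ln.strip()]
    records: Dict[str, str] = {}
    for i in range(0, len(cleaned) - 3, 4):
        # Drop the leading '@' and anything after the first whitespace.
        name = cleaned[i][1:].split()[0]
        records[name] = cleaned[i + 1]
    return records


def build_rx_tags(index1_lines: Iterable[str],
                  index2_lines: Iterable[str]) -> Dict[str, str]:
    """Return read name -> 'UMI1-UMI2' (hypothetical helper, not fgbio)."""
    umis1 = parse_fastq(index1_lines)
    umis2 = parse_fastq(index2_lines)
    unpaired = umis1.keys() ^ umis2.keys()
    if unpaired:
        raise ValueError(f"reads missing from one index file: {sorted(unpaired)}")
    return {name: f"{umis1[name]}-{umis2[name]}" for name in umis1}


if __name__ == "__main__":
    i1 = ["@read1 1:N:0", "ACGTACGT", "+", "IIIIIIII"]
    i2 = ["@read1 2:N:0", "TTGGCCAA", "+", "IIIIIIII"]
    print(build_rx_tags(i1, i2))
```

The resulting strings could then be written to each record in a single pass, for example with pysam's `AlignedSegment.set_tag("RX", value)`, so that neither UMI overwrites the other. That step is an assumption about one possible workflow, not something confirmed in this thread.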

Issue Analytics

  • State: open
  • Created 3 years ago
  • Comments: 9 (4 by maintainers)

Top GitHub Comments

1 reaction
birnbera commented, Feb 24, 2021

Got it. My use case is different but it makes sense to do more aggressive quality filtering for variant detection. Thank you for your help on this!

0 reactions
nh13 commented, Feb 24, 2021

UmiAwareMarkDuplicatesWithMateCigar is mainly used to ensure that two or more reads from the same source molecule are not counted twice in downstream variant calling, but why not leverage the multiple observations for variant calling? I am mostly interested in using UMIs to group reads that originate from the same source molecule, so that I can then create a consensus from each group of reads. The consensus has a much lower error rate and fewer sources of bias (depending on the UMI scheme), so it can enable ultra-accurate variant calling (SNVs/indels/STRs/etc.).
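As a toy illustration of the group-then-consensus idea described above, the Python sketch below groups reads by exact UMI string and takes a per-position majority vote. It is a deliberately simplified assumption-laden sketch, not how fgbio does it: there is no use of mapping position, no tolerance for UMI mismatches, and no base-quality model, and the function names are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple


def group_by_umi(reads: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group (umi, sequence) pairs by exact UMI match."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for umi, seq in reads:
        groups[umi].append(seq)
    return dict(groups)


def consensus(seqs: List[str]) -> str:
    """Per-position majority vote over equal-length sequences.

    Ties go to the base seen first at that position.
    """
    if not seqs:
        raise ValueError("cannot build a consensus from an empty group")
    if any(len(s) != len(seqs[0]) for s in seqs):
        raise ValueError("toy consensus assumes equal-length reads")
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*seqs))


if __name__ == "__main__":
    reads = [
        ("AAAA-CCCC", "ACGT"),
        ("AAAA-CCCC", "ACGT"),
        ("AAAA-CCCC", "ACTT"),  # one read carries an error at position 3
        ("GGGG-TTTT", "TTTT"),
    ]
    for umi, seqs in group_by_umi(reads).items():
        print(umi, consensus(seqs))
```

The single erroneous base in the third read is outvoted by the other two observations, which is the intuition behind the lower error rate of consensus reads.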

In the case of GroupReadsWithUmi it so aggressively filters out reads.

Well, you can turn off filters, and some additional hard-coded filters are being relaxed (see #648), but we’d welcome an opportunity to collaborate.


