How to create duplex UMIs with AnnotateBamWithUmis
I am working with data that uses two UMIs for paired-end reads. One UMI was included as part of index 1 and the other as part of index 2. I'd like to annotate the RX field in my BAM file with both UMIs separated by a dash, as in NNNNNNNN-NNNNNNNN. I see that CorrectUmis can handle duplex UMIs, such that it looks for the consensus sequence independently for each half. What would be the best way to annotate a BAM with duplex UMIs in this case? It seems that when you run AnnotateBamWithUmis twice, the second run replaces the contents of the RX field.
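To make the desired RX value concrete, here is a minimal Python sketch of the pairing step only: it reads the two index FASTQs, matches records by read name, and builds the NNNNNNNN-NNNNNNNN string for each read. It is not how fgbio does it; the helper names read_fastq and duplex_umis are hypothetical, and it assumes plain 4-line FASTQ records with the same read names in both files.

```python
from typing import Dict, Iterable, Iterator, Tuple


def read_fastq(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (read_name, sequence) pairs from 4-line FASTQ records."""
    it = iter(lines)
    for header in it:
        if not header.strip():
            continue  # tolerate blank lines between or after records
        seq = next(it, "").strip()
        next(it, "")  # '+' separator line, ignored
        next(it, "")  # quality line, ignored in this sketch
        # Drop the leading '@', any comment after whitespace,
        # and a trailing /1 or /2 mate suffix.
        name = header.strip()[1:].split()[0]
        if name.endswith(("/1", "/2")):
            name = name[:-2]
        yield name, seq


def duplex_umis(index1: Iterable[str], index2: Iterable[str]) -> Dict[str, str]:
    """Map each read name to 'UMI1-UMI2' (hypothetical helper)."""
    umi1 = dict(read_fastq(index1))
    combined = {}
    for name, umi2 in read_fastq(index2):
        if name not in umi1:
            raise KeyError(f"read {name!r} is missing from the index 1 FASTQ")
        combined[name] = f"{umi1[name]}-{umi2}"
    return combined
```

The resulting dictionary could then be used to set the RX tag on each BAM record in a single pass with a BAM library of your choice, which avoids overwriting RX on a second run. That step is not shown here.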
Issue Analytics
- State:
- Created 3 years ago
- Comments:9 (4 by maintainers)
Top GitHub Comments
Got it. My use case is different but it makes sense to do more aggressive quality filtering for variant detection. Thank you for your help on this!
UmiAwareMarkDuplicatesWithMateCigar is mainly used to ensure that two or more reads from the same source molecule are not counted twice in downstream variant calling, but why not leverage the multiple observations for variant calling? I am mostly interested in using UMIs to group reads that originate from the same source molecule, so that I can then create a consensus from each group of reads. The consensus has a much lower error rate and fewer sources of bias (depending on the UMI scheme), so it can enable ultra-accurate variant calling (SNVs/indels/STRs/etc.).
Well, you can turn off filters, and some additional hard-coded filters are being relaxed (see #648), but we'd welcome an opportunity to collaborate.
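To illustrate the idea of grouping reads by source molecule and collapsing each group into a consensus, here is a toy Python sketch. It uses a plain per-position majority vote, which is a stand-in for illustration and not the consensus model fgbio uses. The function names and the (position, UMI, sequence) tuple layout are assumptions, and base qualities, strand, and alignment details are ignored.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple

Read = Tuple[int, str, str]  # (alignment start, UMI string, read sequence)


def group_by_umi(reads: Iterable[Read]) -> Dict[Tuple[int, str], List[str]]:
    """Group read sequences by (alignment start, UMI)."""
    groups = defaultdict(list)
    for pos, umi, seq in reads:
        groups[(pos, umi)].append(seq)
    return dict(groups)


def majority_consensus(seqs: List[str]) -> str:
    """Call the most common base per position; emit 'N' on a tie."""
    length = min(len(s) for s in seqs)  # only compare shared positions
    consensus = []
    for i in range(length):
        counts = Counter(s[i] for s in seqs)
        top = max(counts.values())
        winners = [base for base, n in counts.items() if n == top]
        consensus.append(winners[0] if len(winners) == 1 else "N")
    return "".join(consensus)


reads = [
    (100, "AAAA-CCCC", "ACGT"),
    (100, "AAAA-CCCC", "ACGT"),
    (100, "AAAA-CCCC", "ACTT"),  # one read carries an error at position 3
    (100, "GGGG-TTTT", "TTTT"),
]
consensus_by_group = {
    key: majority_consensus(seqs) for key, seqs in group_by_umi(reads).items()
}
```

In this example the single discordant base in the first group is outvoted by the other two reads, which is the intuition behind the lower error rate of consensus reads.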