Assertion error: reads out of order

I’m running into this error with GroupReadsByUmi after using UmiAwareMarkDuplicatesWithMateCigar. I followed the steps from the closed issue #87, but I’m seeing slightly different behavior than that user.

I’ve run MergeBamAlignment on the unaligned and aligned bam, to create a ‘UMI-marked-bam’. Then, I can take that UMI-marked-bam straight to GroupReadsByUmi and it works. However, if I use UmiAwareMarkDuplicatesWithMateCigar first (to create a duplicate-marked-bam, duplicates only marked, not removed), it fails with the following error:

[2018/06/30 01:06:30 | GroupReadsByUmi | Info] Filtering and sorting input.
[2018/06/30 01:06:42 | GroupReadsByUmi | Info] Sorted     1,000,000 records.  Elapsed time: 00:00:12s.  Time for last 1,000,000:   12s.  Last read position: 9:21,968,715
[2018/06/30 01:06:55 | GroupReadsByUmi | Info] Sorted     2,000,000 records.  Elapsed time: 00:00:24s.  Time for last 1,000,000:   12s.  Last read position: 22:29,091,612
[2018/06/30 01:06:55 | GroupReadsByUmi | Info] Accepted 2,049,604 reads for grouping.
[2018/06/30 01:06:55 | GroupReadsByUmi | Info] Filtered out 388,430 non-PF reads.
[2018/06/30 01:06:55 | GroupReadsByUmi | Info] Filtered out 117,332 reads that were not part of a high confidence FR mapped read pair.
[2018/06/30 01:06:55 | GroupReadsByUmi | Info] Filtered out 0 reads that contained one or more Ns in their UMIs.
[2018/06/30 01:06:55 | GroupReadsByUmi | Info] Assigning reads to UMIs and outputting.
[2018/06/30 01:06:56 | FgBioMain | Info] GroupReadsByUmi failed. Elapsed time: 0.49 minutes.
Exception in thread "main" java.lang.AssertionError: assertion failed: Reads out of order @ MA0337:1:1107:25428:6890/2 + MA0337:1:1105:21989:24121/2
        at scala.Predef$.assert(Predef.scala:219)
        at com.fulcrumgenomics.umi.GroupReadsByUmi.$anonfun$execute$16(GroupReadsByUmi.scala:486)
        at com.fulcrumgenomics.umi.GroupReadsByUmi.$anonfun$execute$16$adapted(GroupReadsByUmi.scala:485)
        at scala.collection.immutable.List.foreach(List.scala:389)
        at com.fulcrumgenomics.umi.GroupReadsByUmi.execute(GroupReadsByUmi.scala:485)
        at com.fulcrumgenomics.cmdline.FgBioMain.makeItSo(FgBioMain.scala:99)
        at com.fulcrumgenomics.cmdline.FgBioMain.makeItSoAndExit(FgBioMain.scala:80)
        at com.fulcrumgenomics.cmdline.FgBioMain$.main(FgBioMain.scala:48)
        at com.fulcrumgenomics.cmdline.FgBioMain.main(FgBioMain.scala)

I've tried both marking duplicates only and removing them with REMOVE_DUPLICATES=true.

I then ran ValidateSamFile on the 'duplicate-marked-bam' and it reports:
"No errors found"

I've tried sorting by coordinate and by query name; neither helped.

I've tried RevertSam with SANITIZE=true and all the REMOVE/RESTORE options set to false (from closed issue #87) and still get this same error. Is there something obvious that I'm missing?

sheenamt commented, Jul 2, 2018

@nh13 The slashes don’t appear when I look at the bam:

samtools view output/165R05_E01_MONCv1_MA0337/165R05_E01_MONCv1_MA0337.dedup.bam | grep MA0337:1:1107:25428:6890
MA0337:1:1107:25428:6890        83      1       156843474       60      101M    =       156843467       -108    CATCCCCTTCTCTGTGGATGGGCAGCCGGCACCGTCTCTGCGCTGGCTCTTCAATGGCTCCGTGCTCAATGAGACCAGCTTCATCTTCACTGAGTTCCTGG  CGGCGHFG?F313GBAHGHHHGEGGEEEE?1?GHHHCEEE0GHGHFGGGGBHGHG2E2EE?G3GBFBFHGHHFG4HGHGGGDGF5FGGFFFFFFCA3AAAA  MC:Z:92M        MD:Z:101        RG:Z:165R05_E01_MONCv1_MA0337   MI:Z:CCCCCCCCC  NM:i:0  MQ:i:60  UQ:i:0  AS:i:101        QX:Z:55BAEEGGA  RX:Z:CCCCCCCCC

But I think I do see the issue. Grepping for either of those reads only produces one line in my ‘dedup.bam’ file, but looking for them in the umi-bam (that is input for marking duplicates), there are two reads. So the issue is that UmiAwareMarkDuplicatesWithMateCigar is removing one of the reads.
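
The grep check above can be generalized to the whole file. The following is only a rough sketch, not part of fgbio or Picard: it assumes SAM text records (for example the output of `samtools view`) are fed in line by line, and the function name `unpaired_read_names` is made up for illustration. It counts primary records per read name and reports paired reads that do not have exactly two, which is the situation described for MA0337:1:1107:25428:6890.

```python
from collections import Counter

# Standard SAM flag bits.
PAIRED = 0x1
SECONDARY = 0x100
SUPPLEMENTARY = 0x800


def unpaired_read_names(sam_lines):
    """Return sorted names of paired reads lacking exactly two primary records.

    `sam_lines` is any iterable of SAM text lines; header lines (starting
    with '@') and blank lines are skipped. Hypothetical helper for illustration.
    """
    counts = Counter()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue
        # Only the first two fields (QNAME, FLAG) are needed.
        fields = line.split(None, 2)
        name, flag = fields[0], int(fields[1])
        # Secondary and supplementary alignments are extra records for the
        # same read, so they must not count toward the pair.
        if flag & (SECONDARY | SUPPLEMENTARY):
            continue
        if flag & PAIRED:
            counts[name] += 1
    return sorted(name for name, n in counts.items() if n != 2)
```

One way to use it would be to pipe `samtools view` output for the dedup BAM into a small script that calls `unpaired_read_names(sys.stdin)` and prints the result. Any name it lists has lost a mate (or gained an extra primary record) somewhere upstream.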

nh13 commented, Dec 19, 2020

@leila0210 is this still an issue?
