Fatal error on MarkDuplicates

Bug Report

I am getting a fatal error from MarkDuplicates when trying to mark duplicates in a BWA-generated BAM file. I have tried version 2.19.2 installed via Homebrew and a freshly built version from GitHub, with identical output. I am still using the old command-line syntax.

Affected tool(s)

MarkDuplicates

java -jar picard.jar MarkDuplicates INPUT=/Users/nuin/Projects/Data/Illumina_small/BAM/NA12877_1/BAM/NA12877_1.bam OUTPUT=/Users/nuin/Projects/Data/Illumina_small/BAM/NA12877_1/BAM/NA12877_1.dedup.bam METRICS_FILE=/Users/nuin/Projects/Data/Illumina_small/BAM/NA12877_1/BAM/NA12877_1.metrics.txt

Affected version(s)

2.19.2 and latest on GitHub

Description

The output is

**********
********** For more information, please see:
********** https://github.com/broadinstitute/picard/wiki/Command-Line-Syntax-Transition-For-Users-(Pre-Transition)
**********
********** The command line looks like this in the new syntax:
**********
**********    MarkDuplicates -INPUT /Users/nuin/Projects/Data/Illumina_small/BAM/NA12877_1/BAM/NA12877_1.bam -OUTPUT /Users/nuin/Projects/Data/Illumina_small/BAM/NA12877_1/BAM/NA12877_1.dedup.bam -METRICS_FILE /Users/nuin/Projects/Data/Illumina_small/BAM/NA12877_1/BAM/NA12877_1.metrics.txt
**********


15:20:25.548 INFO  NativeLibraryLoader - Loading libgkl_compression.dylib from jar:file:/Users/nuin/src/picard/build/libs/picard.jar!/com/intel/gkl/native/libgkl_compression.dylib
[Thu May 09 15:20:25 MDT 2019] MarkDuplicates INPUT=[/Users/nuin/Projects/Data/Illumina_small/BAM/NA12877_1/BAM/NA12877_1.bam] OUTPUT=/Users/nuin/Projects/Data/Illumina_small/BAM/NA12877_1/BAM/NA12877_1.dedup.bam METRICS_FILE=/Users/nuin/Projects/Data/Illumina_small/BAM/NA12877_1/BAM/NA12877_1.metrics.txt    MAX_SEQUENCES_FOR_DISK_READ_ENDS_MAP=50000 MAX_FILE_HANDLES_FOR_READ_ENDS_MAP=8000 SORTING_COLLECTION_SIZE_RATIO=0.25 TAG_DUPLICATE_SET_MEMBERS=false REMOVE_SEQUENCING_DUPLICATES=false TAGGING_POLICY=DontTag CLEAR_DT=true DUPLEX_UMI=false ADD_PG_TAG_TO_READS=true REMOVE_DUPLICATES=false ASSUME_SORTED=false DUPLICATE_SCORING_STRATEGY=SUM_OF_BASE_QUALITIES PROGRAM_RECORD_ID=MarkDuplicates PROGRAM_GROUP_NAME=MarkDuplicates READ_NAME_REGEX=<optimized capture of last three ':' separated fields as numeric values> OPTICAL_DUPLICATE_PIXEL_DISTANCE=100 MAX_OPTICAL_DUPLICATE_SET_SIZE=300000 VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false GA4GH_CLIENT_SECRETS=client_secrets.json USE_JDK_DEFLATER=false USE_JDK_INFLATER=false
[Thu May 09 15:20:25 MDT 2019] Executing as nuin@Paulos-iMac-Pro.local on Mac OS X 10.14.4 x86_64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_211-b12; Deflater: Intel; Inflater: Intel; Provider GCS is not available; Picard version: 2.20.0-1-gc4dff1c-SNAPSHOT
INFO	2019-05-09 15:20:25	MarkDuplicates	Start of doWork freeMemory: 1014138720; totalMemory: 1029177344; maxMemory: 15271460864
INFO	2019-05-09 15:20:25	MarkDuplicates	Reading input file and constructing read end information.
INFO	2019-05-09 15:20:25	MarkDuplicates	Will retain up to 55331379 data points before spilling to disk.
INFO	2019-05-09 15:20:30	MarkDuplicates	Read     1,000,000 records.  Elapsed time: 00:00:04s.  Time for last 1,000,000:    4s.  Last read position: chr9:98,009,629
INFO	2019-05-09 15:20:30	MarkDuplicates	Tracking 509 as yet unmatched pairs. 135 records in RAM.
INFO	2019-05-09 15:20:33	MarkDuplicates	Read     2,000,000 records.  Elapsed time: 00:00:07s.  Time for last 1,000,000:    3s.  Last read position: chr16:68,863,325
INFO	2019-05-09 15:20:33	MarkDuplicates	Tracking 387 as yet unmatched pairs. 52 records in RAM.
INFO	2019-05-09 15:20:34	MarkDuplicates	Read 2443004 records. 0 pairs never matched.
INFO	2019-05-09 15:20:36	MarkDuplicates	After buildSortedReadEndLists freeMemory: 1980921848; totalMemory: 2719481856; maxMemory: 15271460864
INFO	2019-05-09 15:20:36	MarkDuplicates	Will retain up to 477233152 duplicate indices before spilling to disk.
INFO	2019-05-09 15:20:37	MarkDuplicates	Traversing read pair information and detecting duplicates.
INFO	2019-05-09 15:20:38	MarkDuplicates	Traversing fragment information and detecting duplicates.
INFO	2019-05-09 15:20:38	MarkDuplicates	Sorting list of duplicate records.
INFO	2019-05-09 15:20:39	MarkDuplicates	After generateDuplicateIndexes freeMemory: 2873101640; totalMemory: 6727663616; maxMemory: 15271460864
INFO	2019-05-09 15:20:39	MarkDuplicates	Marking 629356 records as duplicates.
INFO	2019-05-09 15:20:39	MarkDuplicates	Found 81 optical duplicate clusters.
INFO	2019-05-09 15:20:39	MarkDuplicates	Reads are assumed to be ordered by: coordinate
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x0000000141644ea7, pid=10096, tid=0x0000000000002603
#
# JRE version: Java(TM) SE Runtime Environment (8.0_211-b12) (build 1.8.0_211-b12)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.211-b12 mixed mode bsd-amd64 compressed oops)
# Problematic frame:
# C  [libgkl_compression2348235168616825397.dylib+0x6ea7]  deflate_medium+0x867
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /Users/nuin/src/picard/build/libs/hs_err_pid10096.log
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.java.com/bugreport/crash.jsp
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#

Error log is attached. hs_err_pid10096.log
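
To confirm that a crash like this one originates in the GKL compression library rather than in the JVM itself, the "Problematic frame" entry of the crash log can be checked. The following is a minimal sketch, not part of Picard: the helper name crashed_in_gkl is hypothetical, and the default log file name is simply the one from this report.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Picard): succeeds when a JVM crash log
# names the GKL compression library in its problematic frame.
crashed_in_gkl() {
  local log="$1"
  # The frame itself is printed on the line after the "# Problematic frame:" header.
  grep -A1 '^# Problematic frame:' "$log" | grep -q 'libgkl_compression'
}

# Default to the log name from this report; pass another path as the first argument.
log="${1:-hs_err_pid10096.log}"
if [[ -f "$log" ]] && crashed_in_gkl "$log"; then
  echo "Crash is in the GKL native library; the JDK deflater/inflater may be worth trying."
fi
```

If the check succeeds, the failure is in native compression code, which is consistent with the deflate_medium frame shown in the output above.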

Expected behavior

A BAM file should be generated at the end.

Actual behavior

A truncated file is generated.

Any help appreciated.

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 8 (2 by maintainers)

Top GitHub Comments

19 reactions
mrnameless123 commented, May 24, 2019

@nuin I resolved the problem by adding the following parameters to the end of the command: USE_JDK_DEFLATER=true USE_JDK_INFLATER=true
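
Applied to the command from the report, that workaround might look like the sketch below. The two USE_JDK_* parameters come from the comment above, and the file paths are the ones from the original command. The PICARD_JAR, BAM_DIR and SAMPLE variables are assumptions added for illustration. The script prints the assembled command and runs it only if the jar file is found.

```shell
#!/usr/bin/env bash
# Sketch: rerun MarkDuplicates with the JDK deflater/inflater instead of the
# Intel GKL native library that crashed in deflate_medium.
set -euo pipefail

# Hypothetical variables; the defaults are the paths from the original report.
PICARD_JAR="${PICARD_JAR:-picard.jar}"
BAM_DIR="${BAM_DIR:-/Users/nuin/Projects/Data/Illumina_small/BAM/NA12877_1/BAM}"
SAMPLE="${SAMPLE:-NA12877_1}"

# Old-style KEY=value syntax, as used in the report.
cmd=(
  java -jar "$PICARD_JAR" MarkDuplicates
  "INPUT=$BAM_DIR/$SAMPLE.bam"
  "OUTPUT=$BAM_DIR/$SAMPLE.dedup.bam"
  "METRICS_FILE=$BAM_DIR/$SAMPLE.metrics.txt"
  USE_JDK_DEFLATER=true
  USE_JDK_INFLATER=true
)

# Show the command for inspection, then run it only when the jar is present.
printf '%s ' "${cmd[@]}"; echo
if [[ -f "$PICARD_JAR" ]]; then
  "${cmd[@]}"
fi
```

The JDK implementations are generally slower than the Intel ones, so this trades some speed for avoiding the native code path that crashed.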

0 reactions
skchronicles commented, Oct 29, 2020

Hey @lbergelson,

Just as a side note, I ended up bundling the picardcloud.jar in my Dockerfile above (I just added a COPY statement to get the jar file into the container’s filesystem), and that also solved the problem (without the need to add USE_JDK_DEFLATER=true USE_JDK_INFLATER=true).

I am just going to use that JAR distribution instead.
