Fatal error on MarkDuplicates
Bug Report
I am getting a fatal error from MarkDuplicates when trying to mark duplicates in a BWA-generated BAM file. I have tried version 2.19.2 installed via Homebrew and a freshly built version from GitHub, with identical output. I am still using the old command-line syntax.
Affected tool(s)
MarkDuplicates
java -jar picard.jar MarkDuplicates INPUT=/Users/nuin/Projects/Data/Illumina_small/BAM/NA12877_1/BAM/NA12877_1.bam OUTPUT=/Users/nuin/Projects/Data/Illumina_small/BAM/NA12877_1/BAM/NA12877_1.dedup.bam METRICS_FILE=/Users/nuin/Projects/Data/Illumina_small/BAM/NA12877_1/BAM/NA12877_1.metrics.txt
Affected version(s)
2.19.2 and latest on GitHub
Description
The output is:
**********
********** For more information, please see:
********** https://github.com/broadinstitute/picard/wiki/Command-Line-Syntax-Transition-For-Users-(Pre-Transition)
**********
********** The command line looks like this in the new syntax:
**********
********** MarkDuplicates -INPUT /Users/nuin/Projects/Data/Illumina_small/BAM/NA12877_1/BAM/NA12877_1.bam -OUTPUT /Users/nuin/Projects/Data/Illumina_small/BAM/NA12877_1/BAM/NA12877_1.dedup.bam -METRICS_FILE /Users/nuin/Projects/Data/Illumina_small/BAM/NA12877_1/BAM/NA12877_1.metrics.txt
**********
15:20:25.548 INFO NativeLibraryLoader - Loading libgkl_compression.dylib from jar:file:/Users/nuin/src/picard/build/libs/picard.jar!/com/intel/gkl/native/libgkl_compression.dylib
[Thu May 09 15:20:25 MDT 2019] MarkDuplicates INPUT=[/Users/nuin/Projects/Data/Illumina_small/BAM/NA12877_1/BAM/NA12877_1.bam] OUTPUT=/Users/nuin/Projects/Data/Illumina_small/BAM/NA12877_1/BAM/NA12877_1.dedup.bam METRICS_FILE=/Users/nuin/Projects/Data/Illumina_small/BAM/NA12877_1/BAM/NA12877_1.metrics.txt MAX_SEQUENCES_FOR_DISK_READ_ENDS_MAP=50000 MAX_FILE_HANDLES_FOR_READ_ENDS_MAP=8000 SORTING_COLLECTION_SIZE_RATIO=0.25 TAG_DUPLICATE_SET_MEMBERS=false REMOVE_SEQUENCING_DUPLICATES=false TAGGING_POLICY=DontTag CLEAR_DT=true DUPLEX_UMI=false ADD_PG_TAG_TO_READS=true REMOVE_DUPLICATES=false ASSUME_SORTED=false DUPLICATE_SCORING_STRATEGY=SUM_OF_BASE_QUALITIES PROGRAM_RECORD_ID=MarkDuplicates PROGRAM_GROUP_NAME=MarkDuplicates READ_NAME_REGEX=<optimized capture of last three ':' separated fields as numeric values> OPTICAL_DUPLICATE_PIXEL_DISTANCE=100 MAX_OPTICAL_DUPLICATE_SET_SIZE=300000 VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false GA4GH_CLIENT_SECRETS=client_secrets.json USE_JDK_DEFLATER=false USE_JDK_INFLATER=false
[Thu May 09 15:20:25 MDT 2019] Executing as nuin@Paulos-iMac-Pro.local on Mac OS X 10.14.4 x86_64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_211-b12; Deflater: Intel; Inflater: Intel; Provider GCS is not available; Picard version: 2.20.0-1-gc4dff1c-SNAPSHOT
INFO 2019-05-09 15:20:25 MarkDuplicates Start of doWork freeMemory: 1014138720; totalMemory: 1029177344; maxMemory: 15271460864
INFO 2019-05-09 15:20:25 MarkDuplicates Reading input file and constructing read end information.
INFO 2019-05-09 15:20:25 MarkDuplicates Will retain up to 55331379 data points before spilling to disk.
INFO 2019-05-09 15:20:30 MarkDuplicates Read 1,000,000 records. Elapsed time: 00:00:04s. Time for last 1,000,000: 4s. Last read position: chr9:98,009,629
INFO 2019-05-09 15:20:30 MarkDuplicates Tracking 509 as yet unmatched pairs. 135 records in RAM.
INFO 2019-05-09 15:20:33 MarkDuplicates Read 2,000,000 records. Elapsed time: 00:00:07s. Time for last 1,000,000: 3s. Last read position: chr16:68,863,325
INFO 2019-05-09 15:20:33 MarkDuplicates Tracking 387 as yet unmatched pairs. 52 records in RAM.
INFO 2019-05-09 15:20:34 MarkDuplicates Read 2443004 records. 0 pairs never matched.
INFO 2019-05-09 15:20:36 MarkDuplicates After buildSortedReadEndLists freeMemory: 1980921848; totalMemory: 2719481856; maxMemory: 15271460864
INFO 2019-05-09 15:20:36 MarkDuplicates Will retain up to 477233152 duplicate indices before spilling to disk.
INFO 2019-05-09 15:20:37 MarkDuplicates Traversing read pair information and detecting duplicates.
INFO 2019-05-09 15:20:38 MarkDuplicates Traversing fragment information and detecting duplicates.
INFO 2019-05-09 15:20:38 MarkDuplicates Sorting list of duplicate records.
INFO 2019-05-09 15:20:39 MarkDuplicates After generateDuplicateIndexes freeMemory: 2873101640; totalMemory: 6727663616; maxMemory: 15271460864
INFO 2019-05-09 15:20:39 MarkDuplicates Marking 629356 records as duplicates.
INFO 2019-05-09 15:20:39 MarkDuplicates Found 81 optical duplicate clusters.
INFO 2019-05-09 15:20:39 MarkDuplicates Reads are assumed to be ordered by: coordinate
#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x0000000141644ea7, pid=10096, tid=0x0000000000002603
#
# JRE version: Java(TM) SE Runtime Environment (8.0_211-b12) (build 1.8.0_211-b12)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.211-b12 mixed mode bsd-amd64 compressed oops)
# Problematic frame:
# C [libgkl_compression2348235168616825397.dylib+0x6ea7] deflate_medium+0x867
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /Users/nuin/src/picard/build/libs/hs_err_pid10096.log
#
# If you would like to submit a bug report, please visit:
# http://bugreport.java.com/bugreport/crash.jsp
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#
The full error log is attached: hs_err_pid10096.log
Expected behavior
A complete, duplicate-marked BAM file should be generated.
Actual behavior
A truncated BAM file is generated instead, as the JVM crashes during writing.
Any help appreciated.
Top GitHub Comments
@nuin I resolved the problem by adding the following parameters to the end of the command; they switch compression from the Intel GKL native library (the problematic frame in the crash report above) to the pure-Java implementation:
USE_JDK_DEFLATER=true USE_JDK_INFLATER=true
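Applied to the original command from this report, that would look like the following (same paths as above):
java -jar picard.jar MarkDuplicates INPUT=/Users/nuin/Projects/Data/Illumina_small/BAM/NA12877_1/BAM/NA12877_1.bam OUTPUT=/Users/nuin/Projects/Data/Illumina_small/BAM/NA12877_1/BAM/NA12877_1.dedup.bam METRICS_FILE=/Users/nuin/Projects/Data/Illumina_small/BAM/NA12877_1/BAM/NA12877_1.metrics.txt USE_JDK_DEFLATER=true USE_JDK_INFLATER=true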
Hey @lbergelson,
Just as a side note, I ended up bundling the picardcloud.jar in my Dockerfile above (I just added a COPY statement to get the jar file into the container's filesystem), and that also solved the problem, without the need to add USE_JDK_DEFLATER=true USE_JDK_INFLATER=true. I am just going to use that JAR distribution instead.
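For anyone reading along, a minimal sketch of that Dockerfile approach might look like the following; the base image and the file paths are assumptions for illustration, not the commenter's actual Dockerfile:
# Hypothetical Dockerfile: bundle the picardcloud.jar distribution instead of a locally built picard.jar
FROM openjdk:8-jre
# Assumed paths: copy the jar from the build context into the container's filesystem
COPY picardcloud.jar /opt/picard/picard.jar
ENTRYPOINT ["java", "-jar", "/opt/picard/picard.jar"]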