Fatal error on MarkDuplicates

Bug Report

I am getting a fatal error with MarkDuplicates when trying to mark duplicates on a BWA-generated BAM file. I have tried version 2.19.2 installed via Homebrew and a freshly built version from GitHub, with identical output. I am still using the old command-line syntax.

Affected tool(s)

MarkDuplicates

java -jar picard.jar MarkDuplicates INPUT=/Users/nuin/Projects/Data/Illumina_small/BAM/NA12877_1/BAM/NA12877_1.bam OUTPUT=/Users/nuin/Projects/Data/Illumina_small/BAM/NA12877_1/BAM/NA12877_1.dedup.bam METRICS_FILE=/Users/nuin/Projects/Data/Illumina_small/BAM/NA12877_1/BAM/NA12877_1.metrics.txt

Affected version(s)

2.19.2 and latest on GitHub

Description

The output is

**********
********** For more information, please see:
********** https://github.com/broadinstitute/picard/wiki/Command-Line-Syntax-Transition-For-Users-(Pre-Transition)
**********
********** The command line looks like this in the new syntax:
**********
**********    MarkDuplicates -INPUT /Users/nuin/Projects/Data/Illumina_small/BAM/NA12877_1/BAM/NA12877_1.bam -OUTPUT /Users/nuin/Projects/Data/Illumina_small/BAM/NA12877_1/BAM/NA12877_1.dedup.bam -METRICS_FILE /Users/nuin/Projects/Data/Illumina_small/BAM/NA12877_1/BAM/NA12877_1.metrics.txt
**********


15:20:25.548 INFO  NativeLibraryLoader - Loading libgkl_compression.dylib from jar:file:/Users/nuin/src/picard/build/libs/picard.jar!/com/intel/gkl/native/libgkl_compression.dylib
[Thu May 09 15:20:25 MDT 2019] MarkDuplicates INPUT=[/Users/nuin/Projects/Data/Illumina_small/BAM/NA12877_1/BAM/NA12877_1.bam] OUTPUT=/Users/nuin/Projects/Data/Illumina_small/BAM/NA12877_1/BAM/NA12877_1.dedup.bam METRICS_FILE=/Users/nuin/Projects/Data/Illumina_small/BAM/NA12877_1/BAM/NA12877_1.metrics.txt    MAX_SEQUENCES_FOR_DISK_READ_ENDS_MAP=50000 MAX_FILE_HANDLES_FOR_READ_ENDS_MAP=8000 SORTING_COLLECTION_SIZE_RATIO=0.25 TAG_DUPLICATE_SET_MEMBERS=false REMOVE_SEQUENCING_DUPLICATES=false TAGGING_POLICY=DontTag CLEAR_DT=true DUPLEX_UMI=false ADD_PG_TAG_TO_READS=true REMOVE_DUPLICATES=false ASSUME_SORTED=false DUPLICATE_SCORING_STRATEGY=SUM_OF_BASE_QUALITIES PROGRAM_RECORD_ID=MarkDuplicates PROGRAM_GROUP_NAME=MarkDuplicates READ_NAME_REGEX=<optimized capture of last three ':' separated fields as numeric values> OPTICAL_DUPLICATE_PIXEL_DISTANCE=100 MAX_OPTICAL_DUPLICATE_SET_SIZE=300000 VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false GA4GH_CLIENT_SECRETS=client_secrets.json USE_JDK_DEFLATER=false USE_JDK_INFLATER=false
[Thu May 09 15:20:25 MDT 2019] Executing as nuin@Paulos-iMac-Pro.local on Mac OS X 10.14.4 x86_64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_211-b12; Deflater: Intel; Inflater: Intel; Provider GCS is not available; Picard version: 2.20.0-1-gc4dff1c-SNAPSHOT
INFO	2019-05-09 15:20:25	MarkDuplicates	Start of doWork freeMemory: 1014138720; totalMemory: 1029177344; maxMemory: 15271460864
INFO	2019-05-09 15:20:25	MarkDuplicates	Reading input file and constructing read end information.
INFO	2019-05-09 15:20:25	MarkDuplicates	Will retain up to 55331379 data points before spilling to disk.
INFO	2019-05-09 15:20:30	MarkDuplicates	Read     1,000,000 records.  Elapsed time: 00:00:04s.  Time for last 1,000,000:    4s.  Last read position: chr9:98,009,629
INFO	2019-05-09 15:20:30	MarkDuplicates	Tracking 509 as yet unmatched pairs. 135 records in RAM.
INFO	2019-05-09 15:20:33	MarkDuplicates	Read     2,000,000 records.  Elapsed time: 00:00:07s.  Time for last 1,000,000:    3s.  Last read position: chr16:68,863,325
INFO	2019-05-09 15:20:33	MarkDuplicates	Tracking 387 as yet unmatched pairs. 52 records in RAM.
INFO	2019-05-09 15:20:34	MarkDuplicates	Read 2443004 records. 0 pairs never matched.
INFO	2019-05-09 15:20:36	MarkDuplicates	After buildSortedReadEndLists freeMemory: 1980921848; totalMemory: 2719481856; maxMemory: 15271460864
INFO	2019-05-09 15:20:36	MarkDuplicates	Will retain up to 477233152 duplicate indices before spilling to disk.
INFO	2019-05-09 15:20:37	MarkDuplicates	Traversing read pair information and detecting duplicates.
INFO	2019-05-09 15:20:38	MarkDuplicates	Traversing fragment information and detecting duplicates.
INFO	2019-05-09 15:20:38	MarkDuplicates	Sorting list of duplicate records.
INFO	2019-05-09 15:20:39	MarkDuplicates	After generateDuplicateIndexes freeMemory: 2873101640; totalMemory: 6727663616; maxMemory: 15271460864
INFO	2019-05-09 15:20:39	MarkDuplicates	Marking 629356 records as duplicates.
INFO	2019-05-09 15:20:39	MarkDuplicates	Found 81 optical duplicate clusters.
INFO	2019-05-09 15:20:39	MarkDuplicates	Reads are assumed to be ordered by: coordinate
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x0000000141644ea7, pid=10096, tid=0x0000000000002603
#
# JRE version: Java(TM) SE Runtime Environment (8.0_211-b12) (build 1.8.0_211-b12)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.211-b12 mixed mode bsd-amd64 compressed oops)
# Problematic frame:
# C  [libgkl_compression2348235168616825397.dylib+0x6ea7]  deflate_medium+0x867
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /Users/nuin/src/picard/build/libs/hs_err_pid10096.log
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.java.com/bugreport/crash.jsp
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#

The error log is attached: hs_err_pid10096.log

Expected behavior

A BAM file should be generated at the end.

Actual behavior

A truncated file is generated.

Any help appreciated.

Issue Analytics

Created 4 years ago
Comments: 8 (2 by maintainers)

Top GitHub Comments

19 reactions

mrnameless123 commented, May 24, 2019

@nuin I resolved the problem by adding the following parameters at the end of the script: USE_JDK_DEFLATER=true USE_JDK_INFLATER=true
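
For reference, here is a minimal sketch of how that workaround might look when applied to the command from the report. The crash frame in the log is inside libgkl_compression (deflate_medium), and the log shows "Deflater: Intel; Inflater: Intel", so these two parameters presumably avoid the crash by switching to the JDK's own compression code. The PICARD_JAR and BAM_DIR variables are illustrative names, not part of the original report; the file paths are copied from it.

```shell
#!/bin/sh
# Sketch only: PICARD_JAR and BAM_DIR are hypothetical helper variables.
PICARD_JAR="${PICARD_JAR:-picard.jar}"
BAM_DIR="${BAM_DIR:-/Users/nuin/Projects/Data/Illumina_small/BAM/NA12877_1/BAM}"

# Original command (old KEY=VALUE syntax) with the two workaround
# parameters appended at the end.
set -- java -jar "$PICARD_JAR" MarkDuplicates \
    INPUT="$BAM_DIR/NA12877_1.bam" \
    OUTPUT="$BAM_DIR/NA12877_1.dedup.bam" \
    METRICS_FILE="$BAM_DIR/NA12877_1.metrics.txt" \
    USE_JDK_DEFLATER=true \
    USE_JDK_INFLATER=true

# Show the assembled command before running it.
echo "Command: $*"

# Run only if the jar is actually present at the given path.
if [ -f "$PICARD_JAR" ]; then
    "$@"
fi
```

If the run completes, the startup banner should report the JDK deflater and inflater instead of Intel.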

0 reactions

skchronicles commented, Oct 29, 2020

Hey @lbergelson,

Just as a side note, I ended up bundling the picardcloud.jar in my Dockerfile above (I just added a COPY statement to get the jar file into the container’s filesystem), and that also solved the problem (without the need to add USE_JDK_DEFLATER=true USE_JDK_INFLATER=true).