Error due to interval.list instead of interval_list suffix

Link: https://gatk.broadinstitute.org/hc/en-us/community/posts/360056676692-HS-PENALTY-20X-is-1-on-one-version-of-GATK-crashes-on-newest-version-

With GATK 4.0.0.0 this command runs to completion, but it reports an HS_PENALTY_20X of -1.

With GATK v4.1.4.1 the same command fails with the error shown in the log below.

I assume an HS_PENALTY_20X of -1 is incorrect?

# CONFIRMING FILES EXIST

=================================

3.8G /data/BIOINFORMATICS/PROJECT_PROD_JN/CAS-0010688777/snakemake-ez-dpops/R3f_MARK_DUPLICATES_FALSE/19065WBC_fixmate_novosort_dupsrmFalse.bam
54M /data/BIOINFORMATICS/PROJECT_PROD_JN/CAS-0010688777/snakemake-ez-dpops/R3d_INTERVALS/19065WBC_R1.bait.interval.list
45M /data/BIOINFORMATICS/PROJECT_PROD_JN/CAS-0010688777/snakemake-ez-dpops/R3d_INTERVALS/19065WBC_R1.target.interval.list

# GATK VERSION

=================================

The Genome Analysis Toolkit (GATK) v4.1.4.1
HTSJDK Version: 2.21.0
Picard Version: 2.21.2
Using GATK jar /data1/BIOINFORMATICS/SOFTWARE/ANACONDA_JN/MINI-CONDA/envs/gatk-newest/share/gatk4-4.1.4.1-1/gatk-package-4.1.4.1-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /data1/BIOINFORMATICS/SOFTWARE/ANACONDA_JN/MINI-CONDA/envs/gatk-newest/share/gatk4-4.1.4.1-1/gatk-package-4.1.4.1-local.jar --version

# GATK COMMAND

gatk CollectHsMetrics --INPUT /data/BIOINFORMATICS/PROJECT_PROD_JN/CAS-0010688777/snakemake-ez-dpops/R3f_MARK_DUPLICATES_FALSE/19065WBC_fixmate_novosort_dupsrmFalse.bam --OUTPUT TEMP_NEW/19065WBC_fixmate_novosort_dupsrm.bam_hs_metrics.txt --BAIT_INTERVALS /data/BIOINFORMATICS/PROJECT_PROD_JN/CAS-0010688777/snakemake-ez-dpops/R3d_INTERVALS/19065WBC_R1.bait.interval.list --TARGET_INTERVALS /data/BIOINFORMATICS/PROJECT_PROD_JN/CAS-0010688777/snakemake-ez-dpops/R3d_INTERVALS/19065WBC_R1.target.interval.list

=================================

22:29:20.348 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/data1/BIOINFORMATICS/SOFTWARE/ANACONDA_JN/MINI-CONDA/envs/gatk-newest/share/gatk4-4.1.4.1-1/gatk-package-4.1.4.1-local.jar!/com/intel/gkl/native/libgkl_compression.so
[Mon Feb 03 22:29:20 EST 2020] CollectHsMetrics --BAIT_INTERVALS /data/BIOINFORMATICS/PROJECT_PROD_JN/CAS-0010688777/snakemake-ez-dpops/R3d_INTERVALS/19065WBC_R1.bait.interval.list --TARGET_INTERVALS /data/BIOINFORMATICS/PROJECT_PROD_JN/CAS-0010688777/snakemake-ez-dpops/R3d_INTERVALS/19065WBC_R1.target.interval.list --INPUT /data/BIOINFORMATICS/PROJECT_PROD_JN/CAS-0010688777/snakemake-ez-dpops/R3f_MARK_DUPLICATES_FALSE/19065WBC_fixmate_novosort_dupsrmFalse.bam --OUTPUT TEMP_NEW/19065WBC_fixmate_novosort_dupsrm.bam_hs_metrics.txt --METRIC_ACCUMULATION_LEVEL ALL_READS --NEAR_DISTANCE 250 --MINIMUM_MAPPING_QUALITY 20 --MINIMUM_BASE_QUALITY 20 --CLIP_OVERLAPPING_READS true --INCLUDE_INDELS false --COVERAGE_CAP 200 --SAMPLE_SIZE 10000 --ALLELE_FRACTION 0.001 --ALLELE_FRACTION 0.005 --ALLELE_FRACTION 0.01 --ALLELE_FRACTION 0.02 --ALLELE_FRACTION 0.05 --ALLELE_FRACTION 0.1 --ALLELE_FRACTION 0.2 --ALLELE_FRACTION 0.3 --ALLELE_FRACTION 0.5 --VERBOSITY INFO --QUIET false --VALIDATION_STRINGENCY STRICT --COMPRESSION_LEVEL 2 --MAX_RECORDS_IN_RAM 500000 --CREATE_INDEX false --CREATE_MD5_FILE false --GA4GH_CLIENT_SECRETS client_secrets.json --help false --version false --showHidden false --USE_JDK_DEFLATER false --USE_JDK_INFLATER false
Feb 03, 2020 10:29:20 PM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
INFO: Failed to detect whether we are running on Google Compute Engine.
[Mon Feb 03 22:29:20 EST 2020] Executing as nowackj1@ridus004.ind.roche.com on Linux 3.10.0-1062.1.2.el7.x86_64 amd64; OpenJDK 64-Bit Server VM 1.8.0_152-release-1056-b12; Deflater: Intel; Inflater: Intel; Provider GCS is available; Picard version: Version:4.1.4.1
[Mon Feb 03 22:29:20 EST 2020] picard.analysis.directed.CollectHsMetrics done. Elapsed time: 0.01 minutes.
Runtime.totalMemory()=2972712960
To get help, see http://broadinstitute.github.io/picard/index.html#GettingHelp
htsjdk.samtools.SAMException: Cannot read non-existent file: file:///data/BIOINFORMATICS/PROJECT_PROD_JN/CAS-0010688777/snakemake-ez-dpops/@HD%09VN:1.4%09SO:unsorted
at htsjdk.samtools.util.IOUtil.assertFileIsReadable(IOUtil.java:498)
at htsjdk.samtools.util.IOUtil.assertFileIsReadable(IOUtil.java:485)
at picard.analysis.directed.CollectTargetedMetrics.doWork(CollectTargetedMetrics.java:115)
at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:305)
at org.broadinstitute.hellbender.cmdline.PicardCommandLineProgramExecutor.instanceMain(PicardCommandLineProgramExecutor.java:25)
at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:163)
at org.broadinstitute.hellbender.Main.mainEntry(Main.java:206)
at org.broadinstitute.hellbender.Main.main(Main.java:292)
Using GATK jar /data1/BIOINFORMATICS/SOFTWARE/ANACONDA_JN/MINI-CONDA/envs/gatk-newest/share/gatk4-4.1.4.1-1/gatk-package-4.1.4.1-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /data1/BIOINFORMATICS/SOFTWARE/ANACONDA_JN/MINI-CONDA/envs/gatk-newest/share/gatk4-4.1.4.1-1/gatk-package-4.1.4.1-local.jar CollectHsMetrics --INPUT /data/BIOINFORMATICS/PROJECT_PROD_JN/CAS-0010688777/snakemake-ez-dpops/R3f_MARK_DUPLICATES_FALSE/19065WBC_fixmate_novosort_dupsrmFalse.bam --OUTPUT TEMP_NEW/19065WBC_fixmate_novosort_dupsrm.bam_hs_metrics.txt --BAIT_INTERVALS /data/BIOINFORMATICS/PROJECT_PROD_JN/CAS-0010688777/snakemake-ez-dpops/R3d_INTERVALS/19065WBC_R1.bait.interval.list --TARGET_INTERVALS /data/BIOINFORMATICS/PROJECT_PROD_JN/CAS-0010688777/snakemake-ez-dpops/R3d_INTERVALS/19065WBC_R1.target.interval.list

(created from Zendesk ticket #4552)
gz#4552
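
A note on reading the exception above: the "non-existent file" is not a file name at all. `@HD%09VN:1.4%09SO:unsorted` is the URL-encoded form of `@HD<TAB>VN:1.4<TAB>SO:unsorted`, which looks like the first header line of an interval list. That is consistent with the argument parser treating a value whose name ends in `.list` as a file of argument values and substituting its lines for the argument. The sketch below is only a toy model of that behaviour, not barclay's actual code; `expand_argument`, `EXPANSION_SUFFIXES` and the sample file names are assumptions made for illustration.

```python
import tempfile
from pathlib import Path
from typing import List, Tuple
from urllib.parse import unquote

# Assumption: in this toy model, only values ending in ".list" are expanded.
EXPANSION_SUFFIXES = (".list",)


def expand_argument(value: str) -> List[str]:
    """Toy model: replace a '*.list' value with the non-empty lines of that file."""
    if value.endswith(EXPANSION_SUFFIXES):
        lines = Path(value).read_text().splitlines()
        return [line for line in lines if line.strip()]
    # Any other value is passed through unchanged.
    return [value]


def demo() -> Tuple[List[str], List[str]]:
    """Write the same content under both suffixes and expand each path."""
    content = "@HD\tVN:1.4\tSO:unsorted\nchr1\t100\t200\t+\tbait_1\n"
    with tempfile.TemporaryDirectory() as tmp:
        bad = Path(tmp) / "sample.bait.interval.list"   # ends in ".list"
        good = Path(tmp) / "sample.bait.interval_list"  # ends in "_list"
        bad.write_text(content)
        good.write_text(content)
        expanded_bad = expand_argument(str(bad))
        expanded_good = [Path(v).name for v in expand_argument(str(good))]
    return expanded_bad, expanded_good


if __name__ == "__main__":
    bad_values, good_values = demo()
    # The first "value" from the mislabelled file is its header line,
    # which matches the decoded path from the stack trace.
    print(repr(bad_values[0]))
    print(repr(unquote("@HD%09VN:1.4%09SO:unsorted")))
    print(good_values)
```

In this model the `.interval.list` path turns into the file's own lines, so the first "file" the tool tries to open is the header line, while the `.interval_list` path is passed through untouched.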

Issue Analytics

  • State: open
  • Created: 4 years ago
  • Comments: 5 (4 by maintainers)

Top GitHub Comments

2 reactions
whaleberg commented, Mar 16, 2020

It’s a barclay problem. We patched barclay to add a warning in the case of an incorrectly labelled interval.list file which should mitigate it. Waiting on a barclay release though.
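
Until a release with that warning is available, the workaround implied by the issue title is to give the tool a path that ends in `.interval_list` rather than `.interval.list`. A minimal sketch, assuming the interval files can be symlinked next to the originals and that the tool only looks at the name of the path it is given; `with_interval_list_suffix` is a hypothetical helper, not part of GATK, Picard or barclay, and it has not been run against GATK itself.

```python
import os
from pathlib import Path

BAD_SUFFIX = ".interval.list"
GOOD_SUFFIX = ".interval_list"


def with_interval_list_suffix(path: str) -> Path:
    """Return a path ending in '.interval_list' that points at the same data.

    If the name ends in '.interval.list', create a sibling symlink with the
    '.interval_list' suffix (unless one already exists) and return it.
    Any other path is returned unchanged.
    """
    src = Path(path)
    if not src.name.endswith(BAD_SUFFIX):
        return src
    alias = src.with_name(src.name[: -len(BAD_SUFFIX)] + GOOD_SUFFIX)
    # lexists() is also true for a dangling symlink, so we never try to
    # create a link on top of an existing entry.
    if not os.path.lexists(alias):
        os.symlink(src.resolve(), alias)
    return alias
```

The returned paths would then be passed as `--BAIT_INTERVALS` and `--TARGET_INTERVALS` in place of the originals. Renaming the files where they are generated is the simpler fix when that step of the workflow is under your control.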

0 reactions
yfarjoun commented, Mar 16, 2020

lol

On Sun, Mar 15, 2020 at 8:03 PM Louis Bergelson wrote:

Huh, a mysterious stranger with insight into the problem. Let's all forget about whoever that person may be. I’m pretty sure they’re correct in their assessment though…

View this comment on GitHub: https://github.com/broadinstitute/picard/issues/1479#issuecomment-599284533

