Getting "ValueError: NOT_FOUND: Could not open" when running the pipeline

Hi. I’m running the pipeline on a CRAM file. I read that the pipeline works with CRAM files, so I guess that’s not the issue. Can you assist in any way? Thanks.

Setup

Operating system: Ubuntu 20.04.5 LTS
DeepVariant version: 1.4.0
Installation method: Docker
Type of data: I have no information about the sequencing instrument; the reference genome was GRCh38.

Steps to reproduce:

Command:

BIN_VERSION="1.4.0"

sudo docker run \
  -v "input":"/input" \
  -v "output":"/output" \
  google/deepvariant:"${BIN_VERSION}" \
  /opt/deepvariant/bin/run_deepvariant \
  --model_type=WES \
  --ref=/input/GCF_000001405.26_GRCh38_genomic1.fa.gz \
  --reads=/input/1115492_23181_0_0.cram \
  --regions "chr3:10,049,322-10,156,156" \
  --output_vcf=/output/output.vcf.gz \
  --output_gvcf=/output/output.g.vcf.gz \
  --num_shards=5

Error trace:

parallel: This job failed:
    sys.exit(main(argv))
  File "/tmp/Bazel.runfiles_ir3xkizo/runfiles/com_google_deepvariant/deepvariant/make_examples.py", line 166, in main
    options = default_options(add_flags=True, flags_obj=FLAGS)
  File "/tmp/Bazel.runfiles_ir3xkizo/runfiles/com_google_deepvariant/deepvariant/make_examples.py", line 128, in default_options
    samples_in_order, sample_role_to_train = one_sample_from_flags(
  File "/tmp/Bazel.runfiles_ir3xkizo/runfiles/com_google_deepvariant/deepvariant/make_examples.py", line 85, in one_sample_from_flags
    sample_name = make_examples_core.assign_sample_name(
  File "/tmp/Bazel.runfiles_ir3xkizo/runfiles/com_google_deepvariant/deepvariant/make_examples_core.py", line 134, in assign_sample_name
    with sam.SamReader(reads_filenames.split(',')[0]) as sam_reader:
  File "/tmp/Bazel.runfiles_ir3xkizo/runfiles/com_google_deepvariant/third_party/nucleus/io/genomics_reader.py", line 221, in __init__
    self._reader = self._native_reader(input_path, **kwargs)
  File "/tmp/Bazel.runfiles_ir3xkizo/runfiles/com_google_deepvariant/third_party/nucleus/io/sam.py", line 260, in _native_reader
    return NativeSamReader(input_path, **kwargs)
  File "/tmp/Bazel.runfiles_ir3xkizo/runfiles/com_google_deepvariant/third_party/nucleus/io/sam.py", line 227, in __init__
    self._reader = sam_reader.SamReader.from_file(
ValueError: NOT_FOUND: Could not open /input/1115492_23181_0_0.cram
parallel: This job failed:
/opt/deepvariant/bin/make_examples --mode calling --ref /input/GCF_000001405.26_GRCh38_genomic1.fa.gz --reads /input/1115492_23181_0_0.cram --examples /tmp/tmpf8z5m2q0/make_examples.tfrecord@5.gz --channels insert_size --gvcf /tmp/tmpf8z5m2q0/gvcf.tfrecord@5.gz --regions chr3:10,049,322-10,156,156 --task 4

I didn’t try the quick start test.

Top GitHub Comments

kishwarshafin commented, Nov 22, 2022

@zivlang,

You need to use the same reference that was used to generate the CRAM file with mapping software such as bwa-mem or minimap2. You can check whether the mapping command was stored in the header of the CRAM file. If you want to see whether DeepVariant works on your machine, please follow the steps in the quick start here: https://github.com/google/deepvariant/blob/r1.4/docs/deepvariant-quick-start.md
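One way to do the header check mentioned above is sketched below. It assumes samtools is installed. `header_summary` is a hypothetical helper name, not part of samtools or DeepVariant. It reads header text on stdin and prints the reference name, reference path (UR) and MD5 (M5) from each @SQ line, plus any command line (CL) recorded on @PG lines. Not every aligner writes these tags, so empty output does not prove anything by itself.

```shell
# Sketch only: summarize reference and aligner info from a SAM/CRAM header.
# header_summary is a hypothetical helper; it reads header text on stdin.
header_summary() {
  awk -F'\t' '
    # @SQ lines describe reference sequences:
    # SN is the name, UR the reference URI/path, M5 the sequence MD5.
    $1 == "@SQ" {
      name = ur = m5 = "-"
      for (i = 2; i <= NF; i++) {
        if ($i ~ /^SN:/) name = substr($i, 4)
        if ($i ~ /^UR:/) ur = substr($i, 4)
        if ($i ~ /^M5:/) m5 = substr($i, 4)
      }
      print "SQ", name, ur, m5
    }
    # @PG lines describe programs; CL holds the command line when recorded.
    $1 == "@PG" {
      for (i = 2; i <= NF; i++)
        if ($i ~ /^CL:/) print "CL", substr($i, 4)
    }
  '
}

# Example usage (the path is a placeholder):
# samtools view -H /path/to/reads.cram | header_summary
```

If the sequence names printed here (for example chr3) do not match the names in the FASTA passed to --ref, that would suggest a different reference build or naming scheme was used for alignment.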

kishwarshafin commented, Nov 21, 2022

@zivlang, please change these two lines in your command:

-v "input":"/input" \
-v "output":"/output" \

to:

-v "$PWD/input":"/input" \
-v "$PWD/output":"/output" \

You are missing a $PWD/ in both lines. Docker bind mounts require an absolute host path, not a relative one, and that is what causes this issue.
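Some background on why the relative form fails: when the source of a -v argument does not start with a slash, Docker treats it as the name of a volume rather than a host directory. The container then sees an empty /input, which matches the NOT_FOUND error. The sketch below resolves a host directory to an absolute path before it is passed to docker. `abs_mount` is a hypothetical helper name, and the directory names are taken from the command in the question.

```shell
# Sketch only: build absolute bind-mount arguments before calling docker.
# abs_mount is a hypothetical helper, not part of Docker or DeepVariant.

# Print "<absolute host dir>:<container dir>" for use with `docker run -v`.
# Returns non-zero if the host directory does not exist.
abs_mount() {
  host_dir="$1"
  container_dir="$2"
  # cd + pwd in a subshell resolves a relative path against the current directory.
  abs="$(cd "$host_dir" 2>/dev/null && pwd)" || {
    echo "host directory not found: $host_dir" >&2
    return 1
  }
  printf '%s:%s\n' "$abs" "$container_dir"
}

# Example usage (commented out so the sketch has no side effects).
# Listing /input inside the container confirms the CRAM file is visible
# before starting a full run:
# sudo docker run \
#   -v "$(abs_mount input /input)" \
#   -v "$(abs_mount output /output)" \
#   google/deepvariant:"${BIN_VERSION}" \
#   ls -l /input
```

Failing early on a missing host directory is a deliberate choice here, since Docker may otherwise create an empty directory at the given path and the error would only surface later inside the pipeline.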
