Getting "ValueError: NOT_FOUND: Could not open" when running the pipeline


Hi. I’m running the pipeline on a CRAM file. I read that the pipeline works with CRAM files, so I guess that’s not the issue. Can you assist in any way? Thanks.

Setup

  • Operating system: Ubuntu 20.04.5 LTS
  • DeepVariant version: 1.4.0
  • Installation method: Docker
  • Type of data: I have no information about the sequencing instrument. The reference genome was GRCh38.

Steps to reproduce:

  • Command:
BIN_VERSION="1.4.0"

sudo docker run \
  -v "input":"/input" \
  -v "output":"/output" \
  google/deepvariant:"${BIN_VERSION}" \
  /opt/deepvariant/bin/run_deepvariant \
  --model_type=WES \
  --ref=/input/GCF_000001405.26_GRCh38_genomic1.fa.gz \
  --reads=/input/1115492_23181_0_0.cram \
  --regions "chr3:10,049,322-10,156,156" \
  --output_vcf=/output/output.vcf.gz \
  --output_gvcf=/output/output.g.vcf.gz \
  --num_shards=5 
  • Error trace:

parallel: This job failed:
    sys.exit(main(argv))
  File "/tmp/Bazel.runfiles_ir3xkizo/runfiles/com_google_deepvariant/deepvariant/make_examples.py", line 166, in main
    options = default_options(add_flags=True, flags_obj=FLAGS)
  File "/tmp/Bazel.runfiles_ir3xkizo/runfiles/com_google_deepvariant/deepvariant/make_examples.py", line 128, in default_options
    samples_in_order, sample_role_to_train = one_sample_from_flags(
  File "/tmp/Bazel.runfiles_ir3xkizo/runfiles/com_google_deepvariant/deepvariant/make_examples.py", line 85, in one_sample_from_flags
    sample_name = make_examples_core.assign_sample_name(
  File "/tmp/Bazel.runfiles_ir3xkizo/runfiles/com_google_deepvariant/deepvariant/make_examples_core.py", line 134, in assign_sample_name
    with sam.SamReader(reads_filenames.split(',')[0]) as sam_reader:
  File "/tmp/Bazel.runfiles_ir3xkizo/runfiles/com_google_deepvariant/third_party/nucleus/io/genomics_reader.py", line 221, in __init__
    self._reader = self._native_reader(input_path, **kwargs)
  File "/tmp/Bazel.runfiles_ir3xkizo/runfiles/com_google_deepvariant/third_party/nucleus/io/sam.py", line 260, in _native_reader
    return NativeSamReader(input_path, **kwargs)
  File "/tmp/Bazel.runfiles_ir3xkizo/runfiles/com_google_deepvariant/third_party/nucleus/io/sam.py", line 227, in __init__
    self._reader = sam_reader.SamReader.from_file(
ValueError: NOT_FOUND: Could not open /input/1115492_23181_0_0.cram
parallel: This job failed:
/opt/deepvariant/bin/make_examples --mode calling --ref /input/GCF_000001405.26_GRCh38_genomic1.fa.gz --reads /input/1115492_23181_0_0.cram --examples /tmp/tmpf8z5m2q0/make_examples.tfrecord@5.gz --channels insert_size --gvcf /tmp/tmpf8z5m2q0/gvcf.tfrecord@5.gz --regions chr3:10,049,322-10,156,156 --task 4

I didn’t try the quick start test.


Top GitHub Comments

kishwarshafin commented, Nov 22, 2022

@zivlang ,

You need to pass the same reference that was used to generate the CRAM file with mapping software such as bwa-mem or minimap2. You can check whether the mapping command was stored in the header of the CRAM file. If you want to see if DeepVariant works on your machine, please follow the steps provided in the quick-start here: https://github.com/google/deepvariant/blob/r1.4/docs/deepvariant-quick-start.md
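
One way to look for that stored command is to read the @PG records in the CRAM header, where mappers usually record their command line in the CL: field. The snippet below is only a sketch: header_commands is a hypothetical helper name, and it assumes samtools is installed and that the mapper actually wrote a CL: field.

```shell
# Hypothetical helper: print the command lines (CL: fields) recorded in
# the @PG records of SAM/CRAM header text read from stdin.
header_commands() {
  # Keep only @PG records, split their tab-separated fields onto
  # separate lines, then keep the CL: fields and drop the "CL:" tag.
  grep '^@PG' | tr '\t' '\n' | grep '^CL:' | sed 's/^CL://'
}

# Typical use, assuming samtools is available on the host:
#   samtools view -H input/1115492_23181_0_0.cram | header_commands
```

If the output names a reference FASTA, that is the file to pass to --ref. If nothing is printed, the header does not record the mapping command and the reference has to be identified some other way.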

kishwarshafin commented, Nov 21, 2022

@zivlang , please change these two lines in your command:

-v "input":"/input" \
-v "output":"/output" \

to:

-v "$PWD/input":"/input" \
-v "$PWD/output":"/output" \

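You are missing the $PWD/ prefix. Docker bind mounts require an absolute host path, not a relative one. A bare name such as input is treated as a named volume rather than your host directory, so the CRAM file is not visible inside the container, which is causing this issue.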
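
To guard against this kind of mistake, the host path can be resolved and checked before it is handed to docker. This is a minimal sketch assuming a POSIX shell; bind_spec is a hypothetical helper name, not part of Docker or DeepVariant.

```shell
# Hypothetical helper: build an absolute "host:container" bind-mount spec,
# failing early if the host directory does not exist.
bind_spec() {
  host_dir=$1
  container_dir=$2
  if [ ! -d "$host_dir" ]; then
    echo "host directory not found: $host_dir" >&2
    return 1
  fi
  # cd followed by pwd turns a relative path into an absolute one.
  abs_dir=$(cd "$host_dir" && pwd)
  printf '%s:%s\n' "$abs_dir" "$container_dir"
}

# Possible use in the command above (not executed here):
#   sudo docker run \
#     -v "$(bind_spec input /input)" \
#     -v "$(bind_spec output /output)" \
#     google/deepvariant:"${BIN_VERSION}" ...
```

Run from the directory that contains input and output, this produces the same mounts as the $PWD/ version, and it reports a missing host directory before the container starts.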
