Getting "ValueError: NOT_FOUND: Could not open" when running the pipeline
Hi. I’m running the pipeline on a CRAM file. I read that the pipeline works with CRAM files, so I guess that’s not the issue. Can you assist in any way? Thanks.
Setup
- Operating system: Ubuntu 20.04.5 LTS
- DeepVariant version: 1.4.0
- Installation method: Docker
- Type of data: I have no information about the sequencing instrument; the reference genome was GRCh38.
Steps to reproduce:
- Command:
BIN_VERSION="1.4.0"
sudo docker run \
-v "input":"/input" \
-v "output":"/output" \
google/deepvariant:"${BIN_VERSION}" \
/opt/deepvariant/bin/run_deepvariant \
--model_type=WES \
--ref=/input/GCF_000001405.26_GRCh38_genomic1.fa.gz \
--reads=/input/1115492_23181_0_0.cram \
--regions "chr3:10,049,322-10,156,156" \
--output_vcf=/output/output.vcf.gz \
--output_gvcf=/output/output.g.vcf.gz \
--num_shards=5
- Error trace:
parallel: This job failed:
    sys.exit(main(argv))
  File "/tmp/Bazel.runfiles_ir3xkizo/runfiles/com_google_deepvariant/deepvariant/make_examples.py", line 166, in main
    options = default_options(add_flags=True, flags_obj=FLAGS)
  File "/tmp/Bazel.runfiles_ir3xkizo/runfiles/com_google_deepvariant/deepvariant/make_examples.py", line 128, in default_options
    samples_in_order, sample_role_to_train = one_sample_from_flags(
  File "/tmp/Bazel.runfiles_ir3xkizo/runfiles/com_google_deepvariant/deepvariant/make_examples.py", line 85, in one_sample_from_flags
    sample_name = make_examples_core.assign_sample_name(
  File "/tmp/Bazel.runfiles_ir3xkizo/runfiles/com_google_deepvariant/deepvariant/make_examples_core.py", line 134, in assign_sample_name
    with sam.SamReader(reads_filenames.split(',')[0]) as sam_reader:
  File "/tmp/Bazel.runfiles_ir3xkizo/runfiles/com_google_deepvariant/third_party/nucleus/io/genomics_reader.py", line 221, in __init__
    self._reader = self._native_reader(input_path, **kwargs)
  File "/tmp/Bazel.runfiles_ir3xkizo/runfiles/com_google_deepvariant/third_party/nucleus/io/sam.py", line 260, in _native_reader
    return NativeSamReader(input_path, **kwargs)
  File "/tmp/Bazel.runfiles_ir3xkizo/runfiles/com_google_deepvariant/third_party/nucleus/io/sam.py", line 227, in __init__
    self._reader = sam_reader.SamReader.from_file(
ValueError: NOT_FOUND: Could not open /input/1115492_23181_0_0.cram
parallel: This job failed:
/opt/deepvariant/bin/make_examples --mode calling --ref /input/GCF_000001405.26_GRCh38_genomic1.fa.gz --reads /input/1115492_23181_0_0.cram --examples /tmp/tmpf8z5m2q0/make_examples.tfrecord@5.gz --channels insert_size --gvcf /tmp/tmpf8z5m2q0/gvcf.tfrecord@5.gz --regions chr3:10,049,322-10,156,156 --task 4
I didn’t try the quick start test.
Top GitHub Comments
@zivlang,
You need to pass the same reference that was used to generate the CRAM file, that is, the one given to the mapping software (for example bwa-mem or minimap2). The mapping command may be stored in the header of the CRAM file, so you can check there. If you want to see whether DeepVariant works on your machine, please follow the steps in the quick start: https://github.com/google/deepvariant/blob/r1.4/docs/deepvariant-quick-start.md
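A minimal sketch of that header check, assuming samtools is installed on the host (the thread does not say so). The helper name pg_commands is hypothetical; it only filters header text and prints the CL (command line) field of each @PG record, which is where mappers usually record how they were invoked. The @SQ lines of the same header may also carry UR and M5 tags that identify the reference.

```shell
# Hypothetical helper: print the command line (CL field) of every @PG
# header record. Reads SAM/CRAM header text on stdin, tab-separated.
pg_commands() {
  awk -F'\t' '/^@PG/ {
    for (i = 2; i <= NF; i++)
      if ($i ~ /^CL:/) print substr($i, 4)   # drop the "CL:" prefix
  }'
}

# Usage (assumes samtools is available; the path is the one from the question):
# samtools view -H input/1115492_23181_0_0.cram | pg_commands
```

If nothing is printed, the header simply has no CL field, and the reference has to be identified some other way.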
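@zivlang, please change these two lines in your command:
-v "input":"/input" \
-v "output":"/output" \
to:
-v "$PWD/input":"/input" \
-v "$PWD/output":"/output" \
You are missing a $PWD/ in front of the host directories. Docker bind mounts require an absolute host path, not a relative one. A bare name such as "input" is treated as the name of a Docker volume rather than as a directory, so the container does not see your files under /input, which is what causes this error.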
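The absolute-path requirement can also be enforced with a small shell helper. This is only a sketch: bind_spec is a hypothetical name, not part of Docker or DeepVariant, and it assumes the command is run from the directory that contains input/ and output/.

```shell
# Hypothetical helper: build an absolute "host:container" bind-mount spec.
# Fails if the host directory does not exist, instead of letting Docker
# fall back to a named volume.
bind_spec() {
  host_dir=$(CDPATH= cd -- "$1" 2>/dev/null && pwd) || {
    echo "no such directory: $1" >&2
    return 1
  }
  printf '%s:%s\n' "$host_dir" "$2"
}

# Usage sketch, followed by the rest of the command from the question:
# sudo docker run \
#   -v "$(bind_spec input /input)" \
#   -v "$(bind_spec output /output)" \
#   google/deepvariant:"${BIN_VERSION}" \
#   /opt/deepvariant/bin/run_deepvariant
```

Resolving the path with cd and pwd has the side effect of catching a misspelled or missing directory before the container starts.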