ValueError: Data loss: Expected mtid >= 0 as mate is supposedly mapped

I’ve successfully run deepvariant with test data. But I keep getting the following error when extracting pileup images from my own provided BAM file. What could be the problem, please?

I1003 20:27:32.183320 140083390310144 make_examples.py:825] Found 0 candidates in chr1:1-1000 [1000 bp] [1.62s elapsed]
I1003 20:27:32.185085 140083390310144 make_examples.py:825] Found 0 candidates in chr1:1001-2000 [1000 bp] [0.00s elapsed]
I1003 20:27:32.186733 140083390310144 make_examples.py:825] Found 0 candidates in chr1:2001-3000 [1000 bp] [0.00s elapsed]
I1003 20:27:32.188343 140083390310144 make_examples.py:825] Found 0 candidates in chr1:3001-4000 [1000 bp] [0.00s elapsed]
I1003 20:27:32.189908 140083390310144 make_examples.py:825] Found 0 candidates in chr1:4001-5000 [1000 bp] [0.00s elapsed]
I1003 20:27:32.191494 140083390310144 make_examples.py:825] Found 0 candidates in chr1:5001-6000 [1000 bp] [0.00s elapsed]
I1003 20:27:32.193065 140083390310144 make_examples.py:825] Found 0 candidates in chr1:6001-7000 [1000 bp] [0.00s elapsed]
I1003 20:27:32.194626 140083390310144 make_examples.py:825] Found 0 candidates in chr1:7001-8000 [1000 bp] [0.00s elapsed]
I1003 20:27:32.196187 140083390310144 make_examples.py:825] Found 0 candidates in chr1:8001-9000 [1000 bp] [0.00s elapsed]
I1003 20:27:32.197738 140083390310144 make_examples.py:825] Found 0 candidates in chr1:9001-10000 [1000 bp] [0.00s elapsed]
Traceback (most recent call last):
  File "/tmp/Bazel.runfiles_8StCi1/runfiles/com_google_deepvariant/deepvariant/make_examples.py", line 1188, in <module>
    tf.app.run()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 125, in run
    _sys.exit(main(argv))
  File "/tmp/Bazel.runfiles_8StCi1/runfiles/com_google_deepvariant/deepvariant/make_examples.py", line 1178, in main
    make_examples_runner(options)
  File "/tmp/Bazel.runfiles_8StCi1/runfiles/com_google_deepvariant/deepvariant/make_examples.py", line 1090, in make_examples_runner
    candidates, examples, gvcfs = region_processor.process(region)
  File "/tmp/Bazel.runfiles_8StCi1/runfiles/com_google_deepvariant/deepvariant/make_examples.py", line 808, in process
    self.in_memory_sam_reader.replace_reads(self.region_reads(region))
  File "/tmp/Bazel.runfiles_8StCi1/runfiles/com_google_deepvariant/deepvariant/make_examples.py", line 844, in region_reads
    reads, self.options.max_reads_per_partition, self.random)
  File "/tmp/Bazel.runfiles_8StCi1/runfiles/com_google_deepvariant/third_party/nucleus/util/utils.py", line 92, in reservoir_sample
    for i, item in enumerate(iterable):
  File "/tmp/Bazel.runfiles_8StCi1/runfiles/six_archive/six.py", line 558, in next
    return type(self).__next__(self)
  File "/tmp/Bazel.runfiles_8StCi1/runfiles/com_google_deepvariant/third_party/nucleus/io/clif_postproc.py", line 67, in __next__
    not_done, record = self._cc_iterable.Next()
ValueError: Data loss: Expected mtid >= 0 as mate is supposedly mapped: fragment_name: "XXX00-XX000_000:0:0000:0000:000000/0" read_number: 1 number_reads: 2 alignment { position { reference_name: "chr1" position: 10540 reverse_strand: true } mapping_quality: 60 cigar { operation: ALIGNMENT_MATCH operation_length: 50 } } aligned_sequence: "ATCGATCGATCGATCGATCGATCGATCGATCGATCGATCGATCGATCGAT" aligned_quality: 35 aligned_quality: 35 aligned_quality: 35 aligned_quality: 35 aligned_quality: 35 aligned_quality: 35 aligned_quality: 35 aligned_quality: 35 aligned_quality: 35 aligned_quality: 35 aligned_quality: 35 aligned_quality: 35 aligned_quality: 35 aligned_quality: 35 aligned_quality: 36 aligned_quality: 36 aligned_quality: 36 aligned_quality: 36 aligned_quality: 35 aligned_quality: 35 aligned_quality: 37 aligned_quality: 37 aligned_quality: 37 aligned_quality: 39 aligned_quality: 39 aligned_quality: 39 aligned_quality: 41 aligned_quality: 41 aligned_quality: 41 aligned_quality: 41 aligned_quality: 41 aligned_quality: 41 aligned_quality: 41 aligned_quality: 41 aligned_quality: 41 aligned_quality: 41 aligned_quality: 41 aligned_quality: 39 aligned_quality: 39 aligned_quality: 39 aligned_quality: 39 aligned_quality: 39 aligned_quality: 37 aligned_quality: 37 aligned_quality: 37 aligned_quality: 37 aligned_quality: 37 aligned_quality: 34 aligned_quality: 34 aligned_quality: 34 next_mate_position { }
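
Editor's note: the message suggests a paired read whose flags say the mate is mapped while the mate's reference is missing (the record above ends with an empty next_mate_position). A minimal sketch for spotting such records in SAM text, for example the output of samtools view, might look like the following. It assumes that this flag/RNEXT mismatch is what triggers the error, which the thread does not confirm. The function names are hypothetical; the flag bits and column positions come from the SAM format.

```python
# Sketch only: flags paired reads whose mate is not marked unmapped
# but whose mate reference name (RNEXT, column 7) is "*".
# Assumption: this mismatch is what the reader rejects with
# "Expected mtid >= 0 as mate is supposedly mapped".

PAIRED = 0x1         # SAM flag bit: template has multiple segments
MATE_UNMAPPED = 0x8  # SAM flag bit: next segment is unmapped


def has_inconsistent_mate(sam_line):
    """Return True if the record claims a mapped mate but has no RNEXT."""
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])
    rnext = fields[6]
    paired = bool(flag & PAIRED)
    mate_mapped = not (flag & MATE_UNMAPPED)
    return paired and mate_mapped and rnext == "*"


def find_inconsistent_mates(sam_lines):
    """Return read names of suspicious records, skipping '@' header lines."""
    return [
        line.split("\t", 1)[0]
        for line in sam_lines
        if not line.startswith("@") and has_inconsistent_mate(line)
    ]


if __name__ == "__main__":
    example = [
        "@HD\tVN:1.6",
        "readA\t83\tchr1\t10541\t60\t50M\t*\t0\t0\tACGT\tIIII",
        "readB\t83\tchr1\t10541\t60\t50M\t=\t10300\t-291\tACGT\tIIII",
        "readC\t89\tchr1\t10541\t60\t50M\t*\t0\t0\tACGT\tIIII",
    ]
    print(find_inconsistent_mates(example))
```

Only readA is reported here: readB has a mate reference ("="), and readC sets the mate-unmapped bit, so neither contradicts itself.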

Top GitHub Comments

pgrosu commented, Oct 10, 2018

Super-awesome to hear!!! It was a fun team-effort 😃

pgrosu commented, Oct 10, 2018

@zyxue Regarding how it knows about each region, I’m not sure how much you want to know, since we can go into great detail here. The big picture of the control flow of make_examples.py is this:

 main() -> 
   make_examples_runner() -> 
      processing_regions_from_options() -> (build_calling_regions, regions_to_process)

build_calling_regions() first parses out the regions to include or exclude by calling a set of Nucleus helper functions that generate the ranges, defined in this file:

https://github.com/google/deepvariant/blob/r0.7/third_party/nucleus/util/ranges.py

Then, following the flow above, processing_regions_from_options() calls regions_to_process(), where a key line takes each region's index modulo the number of shards and compares it with task_id:

return (r for i, r in enumerate(partitioned) if i % num_shards == task_id)

As you know, modulo bounds the remainder to the range 0 to n-1 for a divisor n, so the regions are distributed across the tasks roughly uniformly. Ask more questions, since now you’re getting into Computer Science concepts where I know I can easily lose people in the details.

~[p]
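
Editor's note: the modulo-based sharding described in the comment above can be tried in isolation. The sketch below reuses the generator expression quoted from regions_to_process(). The wrapper function name and the hard-coded region strings are hypothetical, chosen to mirror the 1000 bp windows in the log. It is an illustration, not the DeepVariant implementation.

```python
# Sketch only: round-robin assignment of regions to shards by index modulo.
# regions_for_task is a hypothetical wrapper around the quoted expression.


def regions_for_task(partitioned, num_shards, task_id):
    """Yield the regions whose index leaves remainder task_id mod num_shards."""
    return (r for i, r in enumerate(partitioned) if i % num_shards == task_id)


# Ten 1000 bp windows, like chr1:1-1000 ... chr1:9001-10000 in the log.
partitioned = ["chr1:%d-%d" % (start, start + 999)
               for start in range(1, 10001, 1000)]

num_shards = 3
shards = [list(regions_for_task(partitioned, num_shards, task_id))
          for task_id in range(num_shards)]

if __name__ == "__main__":
    for task_id, regions in enumerate(shards):
        print(task_id, regions)
```

With 10 regions and 3 shards, task 0 receives four regions and tasks 1 and 2 receive three each, so the shard sizes differ by at most one.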