ValueError: Data loss: Expected mtid >= 0 as mate is supposedly mapped
I've successfully run DeepVariant with the test data, but I keep getting the following error when extracting pileup images from my own BAM file. What could be the problem?
I1003 20:27:32.183320 140083390310144 make_examples.py:825] Found 0 candidates in chr1:1-1000 [1000 bp] [1.62s elapsed]
I1003 20:27:32.185085 140083390310144 make_examples.py:825] Found 0 candidates in chr1:1001-2000 [1000 bp] [0.00s elapsed]
I1003 20:27:32.186733 140083390310144 make_examples.py:825] Found 0 candidates in chr1:2001-3000 [1000 bp] [0.00s elapsed]
I1003 20:27:32.188343 140083390310144 make_examples.py:825] Found 0 candidates in chr1:3001-4000 [1000 bp] [0.00s elapsed]
I1003 20:27:32.189908 140083390310144 make_examples.py:825] Found 0 candidates in chr1:4001-5000 [1000 bp] [0.00s elapsed]
I1003 20:27:32.191494 140083390310144 make_examples.py:825] Found 0 candidates in chr1:5001-6000 [1000 bp] [0.00s elapsed]
I1003 20:27:32.193065 140083390310144 make_examples.py:825] Found 0 candidates in chr1:6001-7000 [1000 bp] [0.00s elapsed]
I1003 20:27:32.194626 140083390310144 make_examples.py:825] Found 0 candidates in chr1:7001-8000 [1000 bp] [0.00s elapsed]
I1003 20:27:32.196187 140083390310144 make_examples.py:825] Found 0 candidates in chr1:8001-9000 [1000 bp] [0.00s elapsed]
I1003 20:27:32.197738 140083390310144 make_examples.py:825] Found 0 candidates in chr1:9001-10000 [1000 bp] [0.00s elapsed]
Traceback (most recent call last):
File "/tmp/Bazel.runfiles_8StCi1/runfiles/com_google_deepvariant/deepvariant/make_examples.py", line 1188, in <module>
tf.app.run()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 125, in run
_sys.exit(main(argv))
File "/tmp/Bazel.runfiles_8StCi1/runfiles/com_google_deepvariant/deepvariant/make_examples.py", line 1178, in main
make_examples_runner(options)
File "/tmp/Bazel.runfiles_8StCi1/runfiles/com_google_deepvariant/deepvariant/make_examples.py", line 1090, in make_examples_runner
candidates, examples, gvcfs = region_processor.process(region)
File "/tmp/Bazel.runfiles_8StCi1/runfiles/com_google_deepvariant/deepvariant/make_examples.py", line 808, in process
self.in_memory_sam_reader.replace_reads(self.region_reads(region))
File "/tmp/Bazel.runfiles_8StCi1/runfiles/com_google_deepvariant/deepvariant/make_examples.py", line 844, in region_reads
reads, self.options.max_reads_per_partition, self.random)
File "/tmp/Bazel.runfiles_8StCi1/runfiles/com_google_deepvariant/third_party/nucleus/util/utils.py", line 92, in reservoir_sample
for i, item in enumerate(iterable):
File "/tmp/Bazel.runfiles_8StCi1/runfiles/six_archive/six.py", line 558, in next
return type(self).__next__(self)
File "/tmp/Bazel.runfiles_8StCi1/runfiles/com_google_deepvariant/third_party/nucleus/io/clif_postproc.py", line 67, in __next__
not_done, record = self._cc_iterable.Next()
ValueError: Data loss: Expected mtid >= 0 as mate is supposedly mapped: fragment_name: "XXX00-XX000_000:0:0000:0000:000000/0" read_number: 1 number_reads: 2 alignment { position { reference_name: "chr1" position: 10540 reverse_strand: true } mapping_quality: 60 cigar { operation: ALIGNMENT_MATCH operation_length: 50 } } aligned_sequence: "ATCGATCGATCGATCGATCGATCGATCGATCGATCGATCGATCGATCGAT" aligned_quality: 35 aligned_quality: 35 aligned_quality: 35 aligned_quality: 35 aligned_quality: 35 aligned_quality: 35 aligned_quality: 35 aligned_quality: 35 aligned_quality: 35 aligned_quality: 35 aligned_quality: 35 aligned_quality: 35 aligned_quality: 35 aligned_quality: 35 aligned_quality: 36 aligned_quality: 36 aligned_quality: 36 aligned_quality: 36 aligned_quality: 35 aligned_quality: 35 aligned_quality: 37 aligned_quality: 37 aligned_quality: 37 aligned_quality: 39 aligned_quality: 39 aligned_quality: 39 aligned_quality: 41 aligned_quality: 41 aligned_quality: 41 aligned_quality: 41 aligned_quality: 41 aligned_quality: 41 aligned_quality: 41 aligned_quality: 41 aligned_quality: 41 aligned_quality: 41 aligned_quality: 41 aligned_quality: 39 aligned_quality: 39 aligned_quality: 39 aligned_quality: 39 aligned_quality: 39 aligned_quality: 37 aligned_quality: 37 aligned_quality: 37 aligned_quality: 37 aligned_quality: 37 aligned_quality: 34 aligned_quality: 34 aligned_quality: 34 next_mate_position { }
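The message suggests a record whose flags say the mate is mapped while no mate reference is recorded (note the empty next_mate_position). One way to look for such records, sketched here under the assumption that the BAM has been dumped to SAM text (for example with samtools view -h), is to check each record's FLAG against its RNEXT column. The helper name find_inconsistent_mates and the example records are hypothetical and are not part of DeepVariant or Nucleus; the exact condition Nucleus checks internally may differ.

```python
# Hypothetical diagnostic helper; not part of DeepVariant or Nucleus.
# It works on SAM text lines, e.g. the output of `samtools view -h input.bam`.

PAIRED = 0x1         # SAM FLAG bit: template has multiple segments (paired read)
MATE_UNMAPPED = 0x8  # SAM FLAG bit: next segment (mate) is unmapped


def find_inconsistent_mates(sam_lines):
    """Yield (read_name, flag) for paired records whose mate is flagged as
    mapped (0x8 not set) but whose RNEXT column is '*', i.e. no mate reference."""
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        name, flag, rnext = fields[0], int(fields[1]), fields[6]
        if flag & PAIRED and not flag & MATE_UNMAPPED and rnext == "*":
            yield name, flag


# Synthetic records for illustration only (not taken from the reporter's BAM).
seq, qual = "A" * 50, "F" * 50
example = [
    "@SQ\tSN:chr1\tLN:248956422",
    # FLAG 83 = paired + proper pair + reverse + first in pair, but RNEXT is '*'.
    "readA\t83\tchr1\t10541\t60\t50M\t*\t0\t0\t" + seq + "\t" + qual,
    # FLAG 99 with RNEXT '=' is consistent.
    "readB\t99\tchr1\t10541\t60\t50M\t=\t10700\t209\t" + seq + "\t" + qual,
]
bad_records = list(find_inconsistent_mates(example))
```

If records like readA turn up in a real file, the mate information was probably written inconsistently by an upstream tool, and repairing it there (for example with samtools fixmate on a name-sorted BAM) is likely a better route than working around it in DeepVariant.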
Top GitHub Comments
Super-awesome to hear!!! It was a fun team-effort 😃
@zyxue Regarding how it knows about each region: I'm not sure how much detail you want, since we can go quite deep here. The big picture of the control flow in make_examples.py is this:
build_calling_regions() first works out which regions to include or exclude by calling a set of Nucleus helper functions that generate the ranges. Those helpers are in this file: https://github.com/google/deepvariant/blob/r0.7/third_party/nucleus/util/ranges.py
Then build_calling_regions() calls regions_to_process(), where a key line applies a modulo by the number of shards together with task_id.
As you know, for a divisor n the modulo bounds the remainder to the range 0 to n-1, so the distribution of tasks across shards is ideally uniform. Feel free to ask more questions; we're getting into computer science concepts now, and I know I can easily lose people in the details.
~[p]
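The modulo-based sharding described in the comment above can be sketched roughly as follows. This is not the actual DeepVariant code: the function names partition_region and regions_for_task are hypothetical, the 1000 bp window is taken from the log lines in the question, and the round-robin rule (index modulo number of shards equals task_id) is an assumption about what the key line does.

```python
# Illustrative sketch only; hypothetical names, not DeepVariant's implementation.

def partition_region(contig, start, end, window=1000):
    """Split the 1-based inclusive range [start, end] into windows of at most
    `window` bp, like the chr1:1-1000, chr1:1001-2000, ... regions in the log."""
    regions = []
    pos = start
    while pos <= end:
        stop = min(pos + window - 1, end)
        regions.append((contig, pos, stop))
        pos = stop + 1
    return regions


def regions_for_task(regions, task_id, num_shards):
    """Keep region i only when i % num_shards == task_id (assumed rule), so
    each task gets every num_shards-th region."""
    return [r for i, r in enumerate(regions) if i % num_shards == task_id]


regions = partition_region("chr1", 1, 10000)
shard0 = regions_for_task(regions, task_id=0, num_shards=4)
```

Because the remainder always falls between 0 and num_shards - 1, every region is assigned to exactly one task, and the per-task region counts differ by at most one.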