Difference between MEAN_BAIT_COVERAGE and MEAN_TARGET_COVERAGE over same interval
When using the tool CollectHsMetrics, the mean coverage values calculated for the bait and target regions are quite different, even though I am using the same interval file for both. I am using Java 1.8.0 and Picard v2.1.1.
Here is the command that I am using (with full paths so you can verify versions):
~/progs/jdk1.8.0_73/bin/java -Xmx4g -jar ~/progs/picard-tools-2.1.1/picard.jar CollectHsMetrics R=/mnt/state_lab/reference/resources/hg19/human_g1k_v37.fasta BAIT_INTERVALS=intersection_intervals.intervals.bed.picard BAIT_SET_NAME="consensus" TARGET_INTERVALS=intersection_intervals.intervals.bed.picard I=$file O=$file.Hs_metrics_picv2
The output that I get is:
MEAN_BAIT_COVERAGE    MEAN_TARGET_COVERAGE
118.734306            100.990818
I initially thought the difference might arise because the ‘near bait’ regions are included in this calculation, but changing that parameter (NEAR_DISTANCE) leaves the mean bait coverage unchanged, which suggests this is not the case.
I realize a similar issue was brought up in a previous Picard version (v1.124 #207) but it seems to me this is slightly different because my mean levels are nowhere near that high.
The current Picard documentation does not state this, but in previous versions it looks like mean target coverage was defined as “MEAN_TARGET_COVERAGE: The mean coverage of targets that received at least coverage depth = 2 at one base.” This suggests the two metrics may still be calculated differently. Under this definition, however, I would expect mean target coverage to be higher, not lower.
I would appreciate any feedback / thoughts on this. Many thanks,
- Created 8 years ago
- Comments:5 (1 by maintainers)
Top GitHub Comments
Hi @ajeremywillsey - the main difference in the current version is that when computing bait coverage very few, if any, reads are excluded from the calculation, so you can get a true sense of how the wet-lab assay is functioning. For target coverage, a lot of bases that are expected to be of limited utility in variant calling are excluded. Take a look at the description of the various PCT_EXC metrics to see why bases are being excluded from the target counts. There are parameters to dial most of the filters up and down too.
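To see which filters account for the gap in a given run, it can help to pull the PCT_EXC columns out of the metrics file next to the two means. The following is only a minimal sketch and not part of Picard: `summarize_hs_exclusions` is a hypothetical helper name, and it assumes the usual Picard metrics layout (comment lines starting with `#`, then a tab-separated header row followed by the metrics row).

```shell
# Hypothetical helper (not part of Picard): print the mean coverage and
# PCT_EXC_* columns from a CollectHsMetrics output file, one per line.
# Assumes the usual Picard metrics layout: "#" comment lines, a
# tab-separated header row, then the metrics row(s). Only the first
# metrics row is reported.
summarize_hs_exclusions() {
  awk -F'\t' '
    /^#/ || NF == 0 { next }                 # skip comments and blank lines
    !have_header {                           # first remaining line: column names
      for (i = 1; i <= NF; i++) name[i] = $i
      have_header = 1
      next
    }
    {                                        # next line: the metric values
      for (i = 1; i <= NF; i++)
        if (name[i] ~ /^PCT_EXC_/ || name[i] ~ /^MEAN_(BAIT|TARGET)_COVERAGE$/)
          print name[i] "\t" $i
      exit
    }
  ' "$1"
}

# Usage (file name taken from the command in the question):
#   summarize_hs_exclusions "$file.Hs_metrics_picv2"
```

A large value in one of the PCT_EXC columns points to the filter that is removing most bases from the target counts. The thresholds behind some of these filters can be changed with options such as MINIMUM_MAPPING_QUALITY and MINIMUM_BASE_QUALITY; check the tool's help output for the options available in your Picard version.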
Definitely resolved. Thanks again.