Difference between Mean_BAIT_Coverage and Mean_TARGET_coverage over same interval

Hello,

When using the tool CollectHsMetrics, the mean coverage values calculated for the bait and target regions are quite different, even though I am using the same intervals file for both. I am using Java 1.8.0 and Picard v2.1.1.

Here is the command that I am using (with full paths so you can verify versions):

~/progs/jdk1.8.0_73/bin/java -Xmx4g -jar ~/progs/picard-tools-2.1.1/picard.jar CollectHsMetrics R=/mnt/state_lab/reference/resources/hg19/human_g1k_v37.fasta BAIT_INTERVALS=intersection_intervals.intervals.bed.picard BAIT_SET_NAME="consensus" TARGET_INTERVALS=intersection_intervals.intervals.bed.picard I=$file O=$file.Hs_metrics_picv2

The output that I get is:

MEAN_BAIT_COVERAGE    MEAN_TARGET_COVERAGE
118.734306            100.990818

I initially thought that maybe the difference is due to the fact that the ‘near bait’ regions are included in this calculation, but if I change this parameter, using NEAR_DISTANCE, the mean bait coverage is unchanged, suggesting this is not the case.

I realize a similar issue was brought up in a previous Picard version (v1.124 #207) but it seems to me this is slightly different because my mean levels are nowhere near that high.

The current Picard documentation does not state this, but in previous versions it looks like mean target coverage was defined as “MEAN_TARGET_COVERAGE: The mean coverage of targets that received at least coverage depth = 2 at one base.” This suggests the two values may still be calculated differently. However, under this definition I would expect mean target coverage to be higher, not lower.

I would appreciate any feedback / thoughts on this. Many thanks,

Top GitHub Comments

tfenne commented, Mar 9, 2016

Hi @ajeremywillsey - the main difference in the current version is that bait coverage is computed with very few, if any, reads excluded, so you can get a true sense of how the wet-lab assay is functioning. For target coverage, a lot of bases that are expected to be of limited utility in variant calling are excluded. Take a look at the descriptions of the various PCT_EXC metrics to see why bases are being excluded from the target counts. There are also parameters to dial most of the filters up and down.
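One way to act on this suggestion is to pull the PCT_EXC columns out of the metrics file and read them next to the two mean coverage values. The following is a minimal Python sketch, assuming the usual Picard metrics layout ('#' comment lines, then one tab-separated header row followed by one value row). The helper names and the sample text are hypothetical: the two mean coverage values are the ones reported above, while the PCT_EXC values are made up for illustration.

```python
def read_hs_metrics(text):
    """Return the first metrics row of a Picard-style metrics file as a dict.

    Assumes '#' comment lines, then a tab-separated header row
    immediately followed by a single row of values.
    """
    # Drop comment lines and blank lines before the table
    lines = [ln for ln in text.splitlines() if ln.strip() and not ln.startswith("#")]
    header = lines[0].split("\t")
    values = lines[1].split("\t")
    return dict(zip(header, values))


def excluded_fractions(metrics):
    """Collect every column whose name starts with PCT_EXC_ as a float."""
    return {
        name: float(value)
        for name, value in metrics.items()
        if name.startswith("PCT_EXC_") and value not in ("", "?")
    }


# Hypothetical sample: coverage values from the question, PCT_EXC values invented
SAMPLE = "\n".join([
    "## METRICS CLASS\tpicard.analysis.directed.HsMetrics",
    "\t".join(["MEAN_BAIT_COVERAGE", "MEAN_TARGET_COVERAGE",
               "PCT_EXC_DUPE", "PCT_EXC_MAPQ"]),
    "\t".join(["118.734306", "100.990818", "0.05", "0.03"]),
])

if __name__ == "__main__":
    metrics = read_hs_metrics(SAMPLE)
    print("MEAN_BAIT_COVERAGE  ", metrics["MEAN_BAIT_COVERAGE"])
    print("MEAN_TARGET_COVERAGE", metrics["MEAN_TARGET_COVERAGE"])
    # Largest exclusion categories first
    for name, frac in sorted(excluded_fractions(metrics).items(),
                             key=lambda kv: kv[1], reverse=True):
        print(f"{name}\t{frac:.4f}")
```

To use it on a real run, read the file written by O=... and pass its contents to read_hs_metrics. Filtering by the PCT_EXC_ prefix avoids hard-coding column names, which may differ between Picard versions.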

ajeremywillsey commented, Mar 10, 2016

Definitely resolved. Thanks again.
