parameters setting for somatic/germline variant calling

Dear team, are there any differences in parameter settings between somatic and germline variant calling? I ask because DeepVariant called a variant as heterozygous, but the reads in IGV appear to support a homozygous call. Could my DeepVariant settings be too loose for this variant? (PS: I ran DeepVariant with the default PacBio settings.)

Here is the record in the VCF. The VAF is quite high. Does an AD of 1,20 mean that only 1 read supports the wild-type allele?

chr4 3079267 . G T 36.1 PASS . GT:GQ:DP:AD:VAF:PL 0/1:4:21:1,20:0.952381:33,0,1

Here is the screenshot of the IGV image.

Thanks!
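
For readers decoding the record above, here is a minimal Python sketch that splits the FORMAT and sample columns and recomputes the alternate-allele fraction. It assumes a tab-separated, single-sample record; the helper name `parse_sample` is hypothetical and not part of DeepVariant. Under the VCF convention, AD lists read depths in allele order (reference first, then each alternate), so 1,20 reads as 1 reference-supporting read and 20 alternate-supporting reads.

```python
# The record from the question, with tabs restored between columns.
record = (
    "chr4\t3079267\t.\tG\tT\t36.1\tPASS\t.\t"
    "GT:GQ:DP:AD:VAF:PL\t0/1:4:21:1,20:0.952381:33,0,1"
)


def parse_sample(line):
    """Map FORMAT keys to the values of the first sample column."""
    cols = line.rstrip("\n").split("\t")
    keys = cols[8].split(":")      # FORMAT column
    values = cols[9].split(":")    # first (only) sample column
    return dict(zip(keys, values))


sample = parse_sample(record)

# AD is ordered as REF depth, then ALT depth(s).
ref_depth, alt_depth = (int(x) for x in sample["AD"].split(","))
alt_fraction = alt_depth / (ref_depth + alt_depth)

print(ref_depth, alt_depth, round(alt_fraction, 6))  # 1 20 0.952381
```

The recomputed fraction matches the VAF field in the record, which is consistent with reading AD as 1 reference read and 20 alternate reads.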

Issue Analytics

  • State:closed
  • Created 2 years ago
  • Comments:5

Top GitHub Comments

1 reaction
AndrewCarroll commented, Mar 15, 2022

Hi @Qianwangwoo

We don’t do additional filtering beyond the probabilities from the classifier. In this case, DeepVariant does not have high confidence in choosing between the HET and HOM-ALT genotypes (a GQ of 4 corresponds to ~60% confidence that the genotype call is correct). The QUAL value of 36.1 suggests that DeepVariant is at least fairly confident that the position is not REF.
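
The GQ and QUAL figures above follow from the standard Phred scale, where a score Q corresponds to an error probability of 10^(-Q/10). The sketch below is illustrative only: the function names are hypothetical, and treating normalized PL values as genotype probabilities assumes a flat prior, which may not match exactly how DeepVariant derives its output.

```python
def phred_to_confidence(q):
    """Confidence implied by a Phred-scaled score: 1 - 10^(-q/10)."""
    return 1 - 10 ** (-q / 10)


def pl_to_probabilities(pl):
    """Normalize Phred-scaled likelihoods (PL) so they sum to 1.

    Assumption: a flat prior over genotypes, so the normalized
    likelihoods can be read as approximate genotype probabilities.
    """
    likelihoods = [10 ** (-p / 10) for p in pl]
    total = sum(likelihoods)
    return [x / total for x in likelihoods]


gq_confidence = phred_to_confidence(4)       # about 0.60 (genotype is correct)
qual_confidence = phred_to_confidence(36.1)  # about 0.9998 (site is not REF)

# PL from the record in the question, ordered 0/0, 0/1, 1/1.
p_ref, p_het, p_hom_alt = pl_to_probabilities([33, 0, 1])

print(round(gq_confidence, 3), round(qual_confidence, 4))
print(round(p_ref, 4), round(p_het, 3), round(p_hom_alt, 3))
```

Under these assumptions, HET and HOM-ALT come out close to each other while REF is negligible, which matches the comment's point that the site is very likely variant but hard to genotype.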

A few other points to keep in mind. First, are you using the two-pass DeepVariant-WhatsHap-DeepVariant method? If so, then DeepVariant may be using additional longer-range phasing information.

Second, this variant is at a junction between homopolymers (poly-T and poly-G). This represents the dominant error mode for PacBio HiFi, so it may not be straightforward for a human to assess the probability of a G->T variant here, as opposed to a sequencing error consisting of a T insertion and a G deletion.

If you want higher precision, you can additionally filter on the GQ value (e.g. GQ >= 10 for 90% confidence in the genotype call). However, if you do so, you will lose variant positions like this one, which are very likely not reference but are difficult to genotype.
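
A GQ filter of this kind could be sketched in plain Python as follows. This is an illustration under stated assumptions, not a DeepVariant feature: the function name `filter_by_gq` is hypothetical, the input is assumed to be a single-sample VCF, and records without a usable GQ value are dropped. The second data record in the example is made up for demonstration. In practice, a dedicated tool such as bcftools can apply an equivalent filter.

```python
def filter_by_gq(vcf_lines, min_gq=10):
    """Yield header lines and records whose sample GQ is at least min_gq.

    Assumptions: tab-separated, single-sample VCF; records that lack a
    numeric GQ value are dropped.
    """
    for line in vcf_lines:
        if line.startswith("#"):
            yield line  # keep all header lines unchanged
            continue
        cols = line.rstrip("\n").split("\t")
        fields = dict(zip(cols[8].split(":"), cols[9].split(":")))
        try:
            gq = float(fields["GQ"])
        except (KeyError, ValueError):
            continue  # missing GQ key or "." value
        if gq >= min_gq:
            yield line


lines = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tSAMPLE",
    # Record from the question (GQ = 4).
    "chr4\t3079267\t.\tG\tT\t36.1\tPASS\t.\tGT:GQ:DP:AD:VAF:PL\t0/1:4:21:1,20:0.952381:33,0,1",
    # Hypothetical record with a confident genotype (GQ = 30).
    "chr4\t3080000\t.\tA\tC\t40.0\tPASS\t.\tGT:GQ:DP:AD:VAF:PL\t1/1:30:20:0,20:1:40,30,0",
]

kept = list(filter_by_gq(lines, min_gq=10))
for line in kept:
    print(line)
```

With a threshold of 10, the record from the question is removed while the header and the hypothetical high-GQ record are kept, which illustrates the trade-off described above.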

0 reactions
AndrewCarroll commented, Mar 16, 2022

Hi @Qianwangwoo

Yes, the two-pass method generally improves accuracy for PacBio small variant calling, especially for Indels. Whether it is likely to improve this call, I am not sure. Note that we anticipate releasing a version of DeepVariant for PacBio in the near future that will have comparable accuracy with a single pass of variant calling, so you may prefer to keep your current workflow and wait for that version if you don’t mind updating.
