Updated Cyvcf2 Adds MNPs (Multiple Nucleotide Polymorphisms) - Will Break Tests


Somewhere between our locked version of Cyvcf2 (<0.10.0) and the latest version, support for MNPs was added. Below is the result of a basic content search for "mnp" in the package directory of each version.

Newest

cyvcf2% grep -c mnp *
__init__.py:0
__main__.py:0
grep: __pycache__: Is a directory
cli.py:0
cyvcf2.c:36
cyvcf2.cpython-39-darwin.so:3
cyvcf2.pxd:0
cyvcf2.pyx:3
helpers.c:0
helpers.h:0
relatedness.h:0
grep: tests: Is a directory

<0.10.0

cyvcf2% grep -c mnp *
__init__.py:0
__main__.py:0
grep: __pycache__: Is a directory
cli.py:0
cyvcf2.c:0
cyvcf2.cpython-38-darwin.so:0
cyvcf2.pxd:0
cyvcf2.pyx:0
helpers.c:0
helpers.h:0
relatedness.h:0
grep: tests: Is a directory

Running the test suite will eventually fail on tests/adapter/mongo/test_query.py::test_query_snvs_by_coordinates when the CALLERS constant is accessed with mnp. This is because variant.var_type, which was previously set to SNP or SNV, may now be set to MNP. I have come up with two simple solutions, but I am unsure which one is the correct or best approach.
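The failure can be reproduced in isolation. The sketch below assumes the access to CALLERS is a plain dictionary lookup keyed by the variant category; the trimmed dictionary and the helper callers_for are illustrative and not taken from the code base.

```python
# Trimmed stand-in for the CALLERS constant (illustrative, not the full dict).
CALLERS = {
    "snv": [{"id": "gatk", "name": "GATK"}],
    "sv": [{"id": "manta", "name": "Manta"}],
}


def callers_for(category):
    """Hypothetical helper: look up the callers configured for a category."""
    return CALLERS[category]


# Categories known to the constant resolve as before.
print(callers_for("snv"))

# A category derived from var_type == "mnp" has no entry, so the lookup fails.
try:
    callers_for("mnp")
except KeyError as err:
    print(f"unsupported category: {err}")
```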

A. Add Support for MNP

Add support for MNPs in the CALLERS constant in /constants/__init__.py. I am unsure how this should look.

CALLERS = {
    "snv": [
        {"id": "gatk", "name": "GATK"},
        {"id": "freebayes", "name": "Freebayes"},
        {"id": "samtools", "name": "SAMtools"},
        {"id": "bcftools", "name": "Bcftools"},
        {"id": "deepvariant", "name": "DeepVariant"},
    ],
    
    "cancer": [
        {"id": "mutect", "name": "MuTect"},
        {"id": "pindel", "name": "Pindel"},
        {"id": "gatk", "name": "GATK"},
        {"id": "freebayes", "name": "Freebayes"},
    ],
    "cancer_sv": [
        {"id": "manta", "name": "Manta"},
        {"id": "gatk", "name": "GATK"},
    ],
    "sv": [
        {"id": "gatk", "name": "GATK"},
        {"id": "cnvnator", "name": "CNVnator"},
        {"id": "delly", "name": "Delly"},
        {"id": "tiddit", "name": "TIDDIT"},
        {"id": "manta", "name": "Manta"},
    ],
    "str": [{"id": "expansionhunter", "name": "ExpansionHunter"}],
}
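One possible shape for option A, assuming MNPs are reported by the same callers as SNVs (an assumption, not something confirmed in this issue), is an extra "mnp" key that reuses the "snv" entries. The dictionary here is a trimmed copy for illustration only.

```python
# Trimmed copy of the constant; only part of the "snv" entry is shown.
CALLERS = {
    "snv": [
        {"id": "gatk", "name": "GATK"},
        {"id": "freebayes", "name": "Freebayes"},
    ],
}

# Assumption: MNPs come from the same callers as SNVs, so reuse that list.
# Copy the entries so later edits to one category do not leak into the other.
CALLERS["mnp"] = [dict(caller) for caller in CALLERS["snv"]]

print([caller["id"] for caller in CALLERS["mnp"]])
```

Whether a separate key is wanted at all depends on whether MNPs should be stored as their own category or folded into "snv", which is what option B does.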

B. Cast MNPs to SNPs

To my understanding, an MNP is a set of SNVs. Can they be treated as equivalent? In parser/variant/variant.py, add the following:

    if not category:
        category = variant.var_type
        if category == "indel":
            category = "snv"
        if category == "snp":
            category = "snv"
        if category == "mnp":
            category = "snv"
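The same mapping can be written as a small standalone function, which makes it easy to test without a VCF file. This is only a sketch: the names SNV_LIKE_TYPES and category_from_var_type are hypothetical and do not exist in the code base, and it assumes var_type is a lowercase string such as "snp", "indel" or "mnp".

```python
# var_type values that should be stored under the "snv" category.
# "mnp" is the new value; "indel" and "snp" are already folded this way.
SNV_LIKE_TYPES = {"snp", "indel", "mnp"}


def category_from_var_type(var_type, category=None):
    """Hypothetical helper: return the explicit category, else derive one."""
    if category:
        return category
    return "snv" if var_type in SNV_LIKE_TYPES else var_type


for var_type in ("snp", "indel", "mnp", "sv"):
    print(var_type, "->", category_from_var_type(var_type))
```

Keeping the accepted types in one set means a future var_type value only needs a one-line change instead of another if block.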

Issue Analytics

  • State: closed
  • Created: 2 years ago
  • Comments: 7 (2 by maintainers)

Top GitHub Comments

mikaell commented, Jan 18, 2022 (1 reaction)

It will be a bug in the moment we de-freeze the CyVCF2 lib, it’s not a bug at the moment, right @mikaell ?

No, not at the moment.

northwestwitch commented, Jan 27, 2022 (0 reactions)

This one was fixed by #3114. Closing!
