Error if ~harmonize_rca used in GenBank

If the GenBank file has a feature labeled ~harmonize_rca(e_coli => h_sapiens), then running

from dnachisel import DnaOptimizationProblem
problem = DnaOptimizationProblem.from_record("example_sequence.gb")

raises

/path/to/DnaChisel/dnachisel/Specification/FeatureRepresentationMixin.py in from_label(cls, label, location, specifications_dict)
    127                 "Unknown parameter %s for specification %s "
    128                 "at location %s"
--> 129                 % (faulty_parameter, specification, kwargs["location"])
    130             )
    131 

TypeError: Unknown parameter e_coli  for specification harmonize_rca at location 1218-2308(+)

The same error occurs with h_sapiens or h_sapiens_9606 as the parameter.

The harmonization itself works when requested through CodonOptimize(), e.g. CodonOptimize(species='h_sapiens', location=(0, 99), original_species='e_coli').

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 11 (11 by maintainers)

Top GitHub Comments

Zulko commented, Jun 10, 2021

CodonOptimize(species='h_sapiens', location=(0, 99), original_species='e_coli', method="harmonize_rca") returns HarmonizeRCA[0-99](h_sapiens) which is correct.

That’s not 100% correct: in an ideal world it would return HarmonizeRCA[0-99](e_coli->h_sapiens). I think the feature->string conversion could be improved.

Indeed, SnapGene Viewer saves a quoted annotation with doubled double quotes, for example /label=~harmonize_rca(""e_coli => h_sapiens""), so I recommend implementing this feature without quotes.
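To illustrate the quoting problem, here is a minimal sketch of how a label could be cleaned before it is parsed. The helper name strip_label_quotes is hypothetical and is not part of DnaChisel; it simply assumes that quotes carry no meaning inside a specification label and drops both straight and curly double quotes.

```python
import re

# Hypothetical helper (not a DnaChisel function): remove straight (")
# and curly (U+201C, U+201D) double quotes from a feature label.
_QUOTES = re.compile('["\u201c\u201d]')


def strip_label_quotes(label: str) -> str:
    """Return the label with every double-quote character removed."""
    return _QUOTES.sub("", label)


# A label as a viewer might save it, with doubled quotes around the argument.
saved_label = '~harmonize_rca(""e_coli => h_sapiens"")'
print(strip_label_quotes(saved_label))
```

With the quotes removed, the label has the unquoted form recommended above, so the doubled quotes written by the viewer no longer matter.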

To be clear, the GenBank API should never use quotes, so you would use HarmonizeRCA("e_coli->h_sapiens") in a Python script (equivalent to CodonOptimize(species='h_sapiens', original_species='e_coli', method="harmonize_rca")), but HarmonizeRCA(e_coli -> h_sapiens) in a GenBank file.

However, if I replace => with -> in the code and use the attached example GenBank file, then it optimizes for E. coli:

Hmm, that’s indeed a bug. ~harmonize_rca(e_coli -> h_sapiens) should set species to h_sapiens and original_species to e_coli, and it looks like it’s failing at that 🤔. I can probably help fix this over the weekend.
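The intended mapping can be sketched as follows. This is only an illustration of the expected behavior, not DnaChisel's actual parsing code: the function name parse_species_pair is hypothetical, and accepting both -> and => as the arrow is an assumption based on the two spellings used in this thread.

```python
import re
from typing import Optional, Tuple

# Assumption: both "->" and "=>" are accepted as the arrow.
_ARROW = re.compile(r"\s*(?:->|=>)\s*")


def parse_species_pair(argument: str) -> Tuple[Optional[str], str]:
    """Split 'e_coli -> h_sapiens' into (original_species, species).

    A single name such as 'h_sapiens' yields (None, 'h_sapiens').
    """
    parts = _ARROW.split(argument.strip())
    if len(parts) > 2 or not all(parts):
        raise ValueError("Cannot parse species argument: %r" % argument)
    if len(parts) == 1:
        return None, parts[0]
    # The left side is the organism the sequence comes from,
    # the right side is the organism to optimize for.
    return parts[0], parts[1]


original_species, species = parse_species_pair("e_coli -> h_sapiens")
print(original_species, species)
```

The point of the sketch is the direction of the arrow: the name on the right is the target species, so an optimization that ends up targeting E. coli has the two sides swapped or the original species dropped.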

Zulko commented, Apr 9, 2021

Sorry for being unclear, I meant that HarmonizeRCA("e_coli => h_sapiens") should work in a script. The GenBank limitation is a bit annoying indeed. Some fields can span multiple lines, but I’m not sure about labels.
