Error if ~harmonize_rca used in GenBank
If the GenBank file has ~harmonize_rca(e_coli => h_sapiens), then running
from dnachisel import DnaOptimizationProblem
problem = DnaOptimizationProblem.from_record("example_sequence.gb")
raises
/path/to/DnaChisel/dnachisel/Specification/FeatureRepresentationMixin.py in from_label(cls, label, location, specifications_dict)
127 "Unknown parameter %s for specification %s "
128 "at location %s"
--> 129 % (faulty_parameter, specification, kwargs["location"])
130 )
131
TypeError: Unknown parameter e_coli for specification harmonize_rca at location 1218-2308(+)
The same error is raised for h_sapiens or h_sapiens_9606.
The function works with CodonOptimize(), e.g. CodonOptimize(species='h_sapiens', location=(0, 99), original_species='e_coli').
Issue Analytics
- State:
- Created 2 years ago
- Comments:11 (11 by maintainers)
Top GitHub Comments
That's not 100% correct; in an ideal world it would return HarmonizeRCA[0-99](e_coli->h_sapiens). I think the feature->string conversion could be improved.

To be clear, the GenBank API should never use quotes, so you would use HarmonizeRCA("e_coli->h_sapiens") in a Python script (equivalent to CodonOptimize(species='h_sapiens', original_species='e_coli', method="harmonize_rca")), but HarmonizeRCA(e_coli -> h_sapiens) in a GenBank file.

Hmm, that's indeed a bug. ~harmonize_rca(e_coli -> h_sapiens) should set species to h_sapiens and original_species to e_coli, and it looks like it's failing at that 🤔. I can probably help fix this over the weekend.

Sorry for being unclear, I meant HarmonizeRCA("e_coli => h_sapiens") should work in a script. The GenBank limitation is a bit annoying indeed. Some fields can have multiple lines, but I'm not sure about labels.
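
For readers following along, the mapping the maintainers describe for a GenBank label (source species on the left of the arrow, target species on the right) can be sketched in plain Python. This is only an illustration of the intended behavior, not DnaChisel's actual parser: the function name parse_harmonize_label, the regular expression, and the returned dictionary layout are all hypothetical. The keys species and original_species simply mirror the CodonOptimize keyword arguments quoted in the issue.

```python
import re

# Hypothetical helper, NOT part of DnaChisel. It only illustrates how a label
# such as "~harmonize_rca(e_coli => h_sapiens)" is expected to be interpreted.
LABEL_PATTERN = re.compile(
    r"^\s*(?P<prefix>[~@]?)\s*(?P<name>\w+)\s*\(\s*"
    r"(?P<origin>\w+)\s*(?:=>|->)\s*(?P<target>\w+)\s*\)\s*$"
)


def parse_harmonize_label(label):
    """Split a 'name(origin => target)' label into its parts.

    Both '=>' and '->' are accepted as the arrow, since both spellings
    appear in the discussion.
    """
    match = LABEL_PATTERN.match(label)
    if match is None:
        raise ValueError("Not a 'name(a => b)' label: %r" % label)
    return {
        "specification": match.group("name"),
        # In GenBank labels, "~" marks an objective and "@" a constraint.
        "is_objective": match.group("prefix") == "~",
        "original_species": match.group("origin"),  # left of the arrow
        "species": match.group("target"),           # right of the arrow
    }


if __name__ == "__main__":
    print(parse_harmonize_label("~harmonize_rca(e_coli => h_sapiens)"))
    print(parse_harmonize_label("~harmonize_rca(e_coli -> h_sapiens_9606)"))
```

Under these assumptions, the label from the report resolves to species h_sapiens and original_species e_coli, which is the result the bug prevents.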