Changing feature labels from gbk record
Hi Zulko,
Love the package! Exactly what I’ve been looking for.
I have a couple of queries about the best way to implement the following things:
Firstly, I’d like to maybe colour each CDS on an individual basis using Matplotlib’s colourscales. How would you advise implementing this? I’ve been using the code below, so I’m thinking I could maybe iterate a list of colours and zip it to the feature list so that each feature is paired with an individual colour? Any better suggestions?
Secondly, is there some way to access additional options for labelling features? In the documented examples, you’ve shown how to forcibly rename all the CDS to a specific string ("CDS here"). I’d like to label some/all of my features with the /product tag from a GenBank file, as it’s more informative than the locus tags. Is there some way to access these in the .features object? I can’t see anything that looks like it corresponds to this.
More generally, could I put in a request for some more ‘fully featured’ examples in the documentation that could be deconstructed (if you get the time, of course!)? I’d like to learn how to use the package in much more depth for the future.
Many thanks!
import os, platform, sys
from os.path import expanduser
home = expanduser("~")
import matplotlib
if platform.system() == "Darwin":
    matplotlib.use('TkAgg')  # Avoid python framework errors on OSX
__author__ = "Joe R. J. Healey"
__version__ = "1.0.0"
__title__ = "PrettyPlotter"
__license__ = "GPLv3"
__author_email__ = "J.R.J.Healey@warwick.ac.uk"
from dna_features_viewer import BiopythonTranslator
class MyCustomTranslator(BiopythonTranslator):
    """Custom translator implementing the following theme:
    - Color terminators in green, CDS in blue, all other features in gold.
    - Do not display features that are restriction sites unless they are BamHI
    - Do not display labels for restriction sites
    - For CDS labels just write "CDS here" instead of the name of the gene.
    """

    def compute_feature_color(self, feature):
        if feature.type == "CDS":
            # maybe zip() together iterated features and a list of colours from a colour scale rather than
            # having all features the same colour?
            return "blue"
        elif feature.type == "terminator":
            return "green"
        else:
            return "gold"

    def compute_feature_label(self, feature):
        if feature.type == "CDS":
            return BiopythonTranslator.compute_feature_label(feature.description)  # how to get /product description to replace CDS name/locus tag?

    def compute_filtered_features(self, features):
        """Do not display 'gene' features."""
        return [feature for feature in features if (feature.type != "gene")]

def parse_args():
    """Parse commandline arguments"""
    import argparse
    try:
        parser = argparse.ArgumentParser(
            description='Make pretty images from sequence files (.dna/.gbk/.gff).')
        parser.add_argument(
            'seqfile',
            action='store',
            help='Input sequence file. Supported filetypes are genbank, GFF, and SnapGene\'s .dna')
        parser.add_argument(
            'image',
            action='store',
            default=home,
            help='Output image filename with extension')
        return parser.parse_args()
    except Exception:
        # argparse reports bad options itself (via SystemExit); this only
        # catches unexpected failures while building the parser.
        print("An exception occurred with argument parsing. Check your provided options.")
        sys.exit(1)

def main():
    args = parse_args()
    if os.path.splitext(args.seqfile)[1] == '.dna':
        print("Input file is a SnapGene DNA file. Calling snapgene_reader to convert to BioPython.")
        from snapgene_reader import snapgene_file_to_seqrecord
        record = snapgene_file_to_seqrecord(args.seqfile)
    else:
        # Other formats are passed to the translator as a file path
        record = args.seqfile
    graphic_record = MyCustomTranslator().translate_record(record)
    ax, _ = graphic_record.plot(figure_width=10)
    ax.figure.tight_layout()
    ax.figure.savefig(args.image)


if __name__ == '__main__':
    main()
Issue Analytics
- Created 5 years ago
- Comments:16 (8 by maintainers)
Top GitHub Comments
Hey there,
I agree more examples would be a good thing. There are several possible answers to your questions.
First, keep in mind that the feature objects in the BiopythonTranslator are Biopython SeqFeature objects, so the Biopython docs will tell you everything about their structure.
Also, if a feature has a “color” or “label” qualifier, it will be used by default by the BiopythonTranslator (unless you have a custom translator in which you override this behavior). That means that instead of putting your logic in a custom translator, you can also pre-process your Biopython record before you feed it to the translator:
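A minimal sketch of that pre-processing step (not the maintainer's original snippet). It assumes the record is a Biopython SeqRecord whose features carry a `qualifiers` dictionary; the function name `label_cds_with_product` is hypothetical.

```python
def label_cds_with_product(record):
    """Copy each CDS feature's /product qualifier into its "label" qualifier.

    Works on any object with a `.features` list whose items have a `.type`
    and a dict-like `.qualifiers` (for example a Biopython SeqRecord).
    """
    for feature in record.features:
        if feature.type != "CDS":
            continue
        product = feature.qualifiers.get("product")
        if product:
            # Qualifier values parsed from GenBank files are usually lists
            # of strings; keep only the first entry as the label.
            feature.qualifiers["label"] = (
                product[0] if isinstance(product, (list, tuple)) else product
            )
    return record
```

With Biopython installed, this would typically be used as `record = SeqIO.read("my_record.gbk", "genbank")` (the file name is a placeholder), then `label_cds_with_product(record)`, then `BiopythonTranslator().translate_record(record)`. CDS features without a /product qualifier are left untouched, so the translator's default labelling still applies to them.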
This being said, here are some ways of doing what you want from the Translator:
For the gene product, I would do it as follows:
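A sketch of one way to do this inside the translator (not the maintainer's original snippet). It assumes `feature.qualifiers["product"]` holds a list of strings, as Biopython produces when parsing GenBank files. The names `product_or_none`, `make_product_translator` and `ProductLabelTranslator` are hypothetical, and the library import sits inside the factory only so the helper can be tried without the plotting stack installed.

```python
def product_or_none(feature):
    """Return the first /product value of a feature, or None if absent."""
    products = feature.qualifiers.get("product")
    if not products:
        return None
    return products[0] if isinstance(products, (list, tuple)) else products


def make_product_translator():
    """Build a translator class that labels CDS features by /product."""
    from dna_features_viewer import BiopythonTranslator

    class ProductLabelTranslator(BiopythonTranslator):
        def compute_feature_label(self, feature):
            if feature.type == "CDS":
                product = product_or_none(feature)
                if product is not None:
                    return product
            # Anything else falls back to the library's default labelling.
            return super().compute_feature_label(feature)

    return ProductLabelTranslator
```

Usage would then be along the lines of `make_product_translator()().translate_record(record)`. In a script where `dna_features_viewer` is already imported at the top, the class can simply be defined at module level instead.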
For iterating through colors, I would do it as follows:
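A sketch of one way to cycle through colours (not the maintainer's original snippet). It assumes `compute_feature_color` is called once per feature, in record order. The names `ColorCycler`, `colormap_colors` and `make_cycling_translator` are hypothetical, and the Matplotlib and `dna_features_viewer` imports are kept inside the functions that need them so the cycling logic can be tried on its own.

```python
import itertools


class ColorCycler:
    """Hands out colours one at a time, starting over when the list ends."""

    def __init__(self, colors):
        self._cycle = itertools.cycle(colors)

    def next_color(self):
        return next(self._cycle)


def colormap_colors(n, cmap_name="viridis"):
    """Sample n evenly spaced hex colours from a Matplotlib colormap."""
    import matplotlib.pyplot as plt
    from matplotlib.colors import to_hex

    cmap = plt.get_cmap(cmap_name)
    return [to_hex(cmap(i / max(n - 1, 1))) for i in range(n)]


def make_cycling_translator(colors):
    """Build a translator class that gives each CDS the next colour."""
    from dna_features_viewer import BiopythonTranslator

    cycler = ColorCycler(colors)

    class CyclingColorTranslator(BiopythonTranslator):
        def compute_feature_color(self, feature):
            if feature.type == "CDS":
                return cycler.next_color()
            # Non-CDS features keep the library's default colouring.
            return super().compute_feature_color(feature)

    return CyclingColorTranslator
```

For example, `make_cycling_translator(colormap_colors(10))` would give a translator class that colours successive CDS features with ten viridis shades and then wraps around.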
Also have a look at matplotlib.colors for ways of generating color palettes.
Brilliant, that works like a dream! Thanks!