GeneticCode
- class GeneticCode(code_sequence, ID=None, name=None, start_codon_sequence=None)
Holds codon to amino acid mapping, and vice versa.
Use the get_code() function to get one of the included code instances. These are created as follows.
>>> code_sequence = (
...     "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
... )
>>> sgc = GeneticCode(code_sequence)
>>> sgc["UUU"] == "F"
>>> sgc["TTT"] == "F"
>>> sgc["F"] == ["TTT", "TTC"]  # in arbitrary order
>>> sgc["*"] == ["TAA", "TAG", "TGA"]  # in arbitrary order
code_sequence : 64-character string containing the NCBI genetic code translation, one amino acid (or * for stop) per codon.
GeneticCode is immutable once created.
- Attributes:
blocks: Returns list of lists of codon blocks in the genetic code.
Methods
changes(other): Returns dict of {codon:'XY'} for codons that differ.
get_stop_indices(dna[, start]): Returns indices of stop codons in the specified frame.
is_start(codon): Returns True if codon is a start codon, False otherwise.
is_stop(codon): Returns True if codon is a stop codon, False otherwise.
sixframes(dna): Returns six-frame translation as dict containing {frame:translation}.
to_regex(seq): Returns a regex pattern with an amino acid expanded to its codon set.
to_table(): Returns aa to codon mapping as a cogent3 Table.
translate(dna[, start]): Translates DNA to protein with current GeneticCode.
get_alphabet([include_stop])
- property blocks
Returns list of lists of codon blocks in the genetic code.
- A codon block can be:
a quartet, if all 4 XYn codons have the same amino acid.
a doublet, if XYt and XYc or XYa and XYg have the same aa.
a singlet, otherwise.
Returns a list of the quartets, doublets, and singlets in the order UUU -> GGG.
Note that a doublet cannot span the purine/pyrimidine boundary, and a quartet cannot span the boundary between two codon blocks whose first two bases differ.
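The block rules above can be sketched in plain Python. This is an illustrative reimplementation under stated assumptions, not cogent3's own code: the helper name `codon_blocks` is hypothetical, and codons are assumed to follow the NCBI base order T, C, A, G so that they line up with the 64-character code string.

```python
from itertools import product

BASES = "TCAG"  # assumed NCBI ordering, so codons run TTT -> GGG
STANDARD = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"


def codon_blocks(code_sequence):
    """Split each XYn group into a quartet, doublets, or singlets."""
    codons = ["".join(c) for c in product(BASES, repeat=3)]
    blocks = []
    for i in range(0, 64, 4):
        quartet, aas = codons[i:i + 4], code_sequence[i:i + 4]
        if len(set(aas)) == 1:
            # All four XYn codons share one amino acid.
            blocks.append(quartet)
            continue
        # Pyrimidine half (XYt, XYc) first, then purine half (XYa, XYg).
        for lo in (0, 2):
            if aas[lo] == aas[lo + 1]:
                blocks.append(quartet[lo:lo + 2])
            else:
                blocks.extend([[quartet[lo]], [quartet[lo + 1]]])
    return blocks


blocks = codon_blocks(STANDARD)
print(blocks[:3])
# [['TTT', 'TTC'], ['TTA', 'TTG'], ['TCT', 'TCC', 'TCA', 'TCG']]
```

In this sketch the three isoleucine codons are split into the doublet ATT/ATC and the singlet ATA, because a doublet is not allowed to cross the purine/pyrimidine boundary.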
- changes(other)
Returns dict of {codon:'XY'} for codons that differ.
X is the string representation of the amino acid in self, and Y is the string representation of the amino acid in other. Each value is always a 2-character string.
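The shape of the returned dict can be illustrated by comparing two code strings directly. This is a minimal sketch, not the library's implementation: `code_changes` is a hypothetical helper, codons are assumed to be in NCBI T, C, A, G order, and the second string is the vertebrate mitochondrial code (NCBI table 2).

```python
from itertools import product

# Codons in the assumed NCBI order: TTT, TTC, TTA, TTG, TCT, ...
CODONS = ["".join(c) for c in product("TCAG", repeat=3)]
STANDARD = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
VERT_MITO = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSS**VVVVAAAADDEEGGGG"


def code_changes(self_code, other_code):
    """Map each differing codon to 'XY': X from self_code, Y from other_code."""
    return {
        codon: x + y
        for codon, x, y in zip(CODONS, self_code, other_code)
        if x != y
    }


diff = code_changes(STANDARD, VERT_MITO)
print(diff)
# {'TGA': '*W', 'ATA': 'IM', 'AGA': 'R*', 'AGG': 'R*'}
```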
- get_alphabet(include_stop=False)
- get_stop_indices(dna, start=0)
Returns the indices of stop codons in the specified frame.
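One way to picture frame-restricted stop detection is the sketch below. It is an assumption-laden illustration, not cogent3's code: `stop_indices` is a hypothetical function, the stop set is hard-coded to the standard code's TAA, TAG and TGA, and `start` is taken to be the offset of the reading frame (0, 1 or 2).

```python
STOPS = {"TAA", "TAG", "TGA"}  # stop codons of the standard code


def stop_indices(dna, start=0):
    """Return positions of stop codons that lie in the frame beginning at `start`."""
    return [
        i
        for i in range(start, len(dna) - 2, 3)  # step codon by codon
        if dna[i:i + 3] in STOPS
    ]


dna = "ATGTAAGGCTGA"
print(stop_indices(dna))     # [3, 9]
print(stop_indices(dna, 1))  # []
```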
- is_start(codon)
Returns True if codon is a start codon, False otherwise.
- is_stop(codon)
Returns True if codon is a stop codon, False otherwise.
- sixframes(dna)
Returns six-frame translation as dict containing {frame:translation}
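A six-frame translation can be sketched as three forward frames plus three frames on the reverse complement. The code below is a hypothetical illustration only: the function names, the frame keys (1, 2, 3 for the forward strand and -1, -2, -3 for the reverse strand) and the silent dropping of trailing incomplete codons are assumptions, and the library's actual keys and ordering may differ.

```python
from itertools import product

STANDARD = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Codon -> amino acid lookup, assuming NCBI T, C, A, G codon order.
TABLE = dict(zip(("".join(c) for c in product("TCAG", repeat=3)), STANDARD))
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(dna, start=0):
    """Translate complete codons from `start`; a trailing partial codon is dropped."""
    return "".join(TABLE[dna[i:i + 3]] for i in range(start, len(dna) - 2, 3))


def six_frames(dna):
    """Return {frame: translation} with assumed keys 1, 2, 3, -1, -2, -3."""
    reverse = dna.translate(COMPLEMENT)[::-1]  # reverse complement
    frames = {f + 1: translate(dna, f) for f in range(3)}
    frames.update({-(f + 1): translate(reverse, f) for f in range(3)})
    return frames


frames = six_frames("ATGAAATAG")
print(frames[1])   # MK*
print(frames[-1])  # LFH
```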
- to_regex(seq)
Returns a regex pattern in which each amino acid is expanded to its codon set.
- Parameters:
- seq
a Sequence or string of amino acids
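The idea of expanding amino acids into codon alternatives can be sketched as follows. This is not the library's implementation: `aa_to_regex` is a hypothetical helper, the mapping is built from the standard code string assuming NCBI T, C, A, G codon order, and the exact pattern syntax cogent3 produces may differ.

```python
import re
from itertools import product

STANDARD = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Amino acid -> list of codons, in codon order TTT -> GGG.
AA_TO_CODONS = {}
for codon, aa in zip(("".join(c) for c in product("TCAG", repeat=3)), STANDARD):
    AA_TO_CODONS.setdefault(aa, []).append(codon)


def aa_to_regex(seq):
    """Build one alternation group per amino acid in `seq`."""
    return "".join("(" + "|".join(AA_TO_CODONS[aa]) + ")" for aa in seq)


pattern = aa_to_regex("MF")
print(pattern)                                   # (ATG)(TTT|TTC)
print(bool(re.fullmatch(pattern, "ATGTTC")))     # True
```

A pattern built this way can be used with `re.search` to find every DNA stretch that could encode a given peptide.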
- to_table()
Returns the amino acid to codon mapping as a cogent3 Table.
- translate(dna, start=0)
Translates DNA to protein with current GeneticCode.
- Parameters:
- dna: str
a string of nucleotides
- start: int
position to begin translation (used to implement frames)
- Returns:
- String containing amino acid sequence. Translates the entire sequence.
- It is the caller’s responsibility to find open reading frames.
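The role of `start` can be illustrated with a small standalone sketch. It is not cogent3's code: `translate_dna` is a hypothetical function, the lookup table is built from the standard code string assuming NCBI T, C, A, G codon order, and dropping a trailing incomplete codon is an assumption about edge-case behaviour.

```python
from itertools import product

STANDARD = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Codon -> amino acid lookup for the standard code.
TABLE = dict(zip(("".join(c) for c in product("TCAG", repeat=3)), STANDARD))


def translate_dna(dna, start=0):
    """Translate every complete codon from `start` to the end of `dna`."""
    dna = dna.upper().replace("U", "T")  # accept RNA spelling as well
    return "".join(TABLE[dna[i:i + 3]] for i in range(start, len(dna) - 2, 3))


print(translate_dna("ATGTTTTAA"))           # MF*
print(translate_dna("ATGTTTTAA", start=1))  # CF
```

Translation does not stop at the `*`; the stop codon is reported in the output and any open reading frame has to be located by the caller.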