Using a protein model#
We use apps to load unaligned DNA sequences and to translate them into amino acids.
from cogent3 import get_app
loader = get_app("load_unaligned", format="fasta")
to_aa = get_app("translate_seqs")
process = loader + to_aa
seqs = process("data/SCA1-cds.fasta")
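To make the translation step concrete, the following is a minimal standard-library sketch of what translating a coding sequence involves. It is not the implementation behind `translate_seqs`: the `translate` helper, the `CODON_TABLE` name and the short input sequence are all hypothetical, and the sketch assumes the standard genetic code, reading frame 1, and "X" for any codon it cannot resolve.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1 (illustrative helper, not cogent3)."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # drop any trailing incomplete codon
    return "".join(
        CODON_TABLE.get(dna[i : i + 3], "X")  # "X" for ambiguous codons
        for i in range(0, usable, 3)
    )


# Hypothetical input, chosen only for illustration.
protein = translate("ATGAAATCCAACCAA")
print(protein)  # MKSNQ
```

In practice the app handles details this sketch leaves out, such as alternative genetic codes and sequence metadata.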
Protein alignment with default settings#
The default setting for “protein” is a WG01 model.
from cogent3 import get_app
aa_aligner = get_app("progressive_align", "protein")
aligned = aa_aligner(seqs)
aligned
             0
Human        MKSNQERSNECLPPKKREIPATSRSSEEKAPTLPSDNHRVEGTAWLPGNPGGRGHGGGRH
Macaque      ........................P......A............................
Rat          ........................P.....TA......C...V....ST..S........
Mouse        ........................P.....TA......C...V....ST..I........
Mouse Lemur  ...............................A.......A..AP................
Chimp        ............................................................
6 x 825 (truncated to 6 x 60) protein alignment
Specify a different distance measure for estimating the guide tree#
The distance measures available are percent or paralinear.
Note
An estimated guide tree has its branch lengths scaled so they are consistent with usage in a protein model.
aa_aligner = get_app("progressive_align", "protein", distance="paralinear")
aligned = aa_aligner(seqs)
aligned
             0
Human        MKSNQERSNECLPPKKREIPATSRSSEEKAPTLPSDNHRVEGTAWLPGNPGGRGHGGGRH
Mouse        ........................P.....TA......C...V....ST..I........
Rat          ........................P.....TA......C...V....ST..S........
Mouse Lemur  ...............................A.......A..AP................
Macaque      ........................P......A............................
Chimp        ............................................................
6 x 825 (truncated to 6 x 60) protein alignment
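To give a feel for what a "percent" style distance measures, here is a minimal sketch that computes the proportion of differing positions between two already-aligned sequences. It rests on assumptions: the `p_distance` helper is hypothetical, positions where either sequence has a gap are skipped, and the two short sequences are made up. The distance calculation the app actually uses to build the guide tree may treat gaps and ambiguous characters differently.

```python
def p_distance(seq1: str, seq2: str, gap: str = "-") -> float:
    """Proportion of compared positions that differ (illustrative helper)."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")

    compared = 0
    differences = 0
    for a, b in zip(seq1, seq2):
        if a == gap or b == gap:
            continue  # assumption: gapped columns are ignored
        compared += 1
        differences += a != b

    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return differences / compared


# Hypothetical aligned fragments: 9 ungapped columns, 1 mismatch.
dist = p_distance("MKSNQERSNE", "MKSNQ-RSNA")
print(round(dist, 4))  # 0.1111
```

A matrix of such pairwise values is the kind of input from which a guide tree can be estimated.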
Alignment settings and file provenance are recorded in the info attribute#
aligned.info
{'Refs': {},
'source': 'data/SCA1-cds.fasta',
'align_params': {'indel_length': 0.1,
'indel_rate': 1e-10,
'guide_tree': "((((Mouse:0.011219581285708921,Rat:0.004763355238688913):0.052856143725369786,Mouse_Lemur:0.03580862702845759):0.024351474041303382,Macaque:0.0023127545121458537):0.003074310394311757,(Human:0.0001974014915215993,Chimp:0.008168683695808834)'AUTOGENERATED_NAME_XM':1e-06);",
'model': 'WG01',
'lnL': -3199.4382635793254}}
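The `guide_tree` entry is a Newick string, so it can be inspected with ordinary string tools. As a rough sketch, the snippet below pulls the tip names out of the string shown above with a regular expression. The `tip_names` helper is hypothetical and assumes unquoted tip names made of letters and underscores. A proper Newick parser is the more robust choice for anything beyond a quick check.

```python
import re

# Guide tree string copied from the info output above.
guide_tree = (
    "((((Mouse:0.011219581285708921,Rat:0.004763355238688913)"
    ":0.052856143725369786,Mouse_Lemur:0.03580862702845759)"
    ":0.024351474041303382,Macaque:0.0023127545121458537)"
    ":0.003074310394311757,(Human:0.0001974014915215993,"
    "Chimp:0.008168683695808834)'AUTOGENERATED_NAME_XM':1e-06);"
)


def tip_names(newick: str) -> list[str]:
    """Return tip labels in order of appearance (illustrative helper).

    A tip label directly follows "(" or ",". Internal node labels follow
    ")" and so are not matched.
    """
    return re.findall(r"(?<=[(,])([A-Za-z_]+):", newick)


print(tip_names(guide_tree))
# ['Mouse', 'Rat', 'Mouse_Lemur', 'Macaque', 'Human', 'Chimp']
```

Note that the tree uses `Mouse_Lemur` with an underscore, whereas the alignment display shows the name with a space.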