Using a protein model

We use apps to load unaligned DNA sequences and to translate them into amino acids.

from cogent3 import get_app

loader = get_app("load_unaligned", format="fasta")
to_aa = get_app("translate_seqs")
process = loader + to_aa
seqs = process("data/SCA1-cds.fasta")
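
To make the translation step concrete, here is a minimal pure-Python sketch of translating DNA to amino acids with the standard genetic code. It is not how `translate_seqs` is implemented. The names `CODON_TABLE` and `translate` are hypothetical, and treating unknown codons as "X" and dropping trailing incomplete codons are assumptions made for illustration.

```python
from itertools import product

# Standard genetic code (NCBI table 1), codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1; '*' marks a stop codon.

    Codons containing characters outside TCAG become 'X' (an assumption),
    and a trailing incomplete codon is ignored.
    """
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(
        CODON_TABLE.get(dna[i : i + 3], "X") for i in range(0, usable, 3)
    )


print(translate("ATGAAATCC"))
```

The real app also handles sequence collections and molecular type checks, which this sketch leaves out.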

Protein alignment with default settings

Passing "protein" selects the default protein substitution model, which the info attribute shown at the end of this page records as JTT92.

from cogent3 import get_app

aa_aligner = get_app("progressive_align", "protein")
aligned = aa_aligner(seqs)
aligned
             0
Human        MKSNQERSNECLPPKKREIPATSRSSEEKAPTLPSDNHRVEGTAWLPGNPGGRGHGGGRH
Macaque      ........................P......A............................
Rat          ........................P.....TA......C...V....ST..S........
Mouse        ........................P.....TA......C...V....ST..I........
Mouse Lemur  ...............................A.......A..AP................
Chimp        ............................................................

6 x 825 (truncated to 6 x 60) protein alignment
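
The display above appears to use a dot for any residue that is identical to the reference sequence in the top row (Human here), which is our reading of the output rather than something the text states. A minimal sketch of that convention, using a hypothetical helper `dot_rows` and short made-up fragments:

```python
def dot_rows(reference: str, others: dict[str, str]) -> dict[str, str]:
    """Replace residues that match the reference with '.'.

    All sequences are assumed to be aligned, so they share one length.
    """
    rows = {}
    for name, seq in others.items():
        if len(seq) != len(reference):
            raise ValueError(f"{name} is not the same length as the reference")
        rows[name] = "".join(
            "." if ref_char == char else char
            for ref_char, char in zip(reference, seq)
        )
    return rows


# Illustrative fragments only, not taken from the alignment above.
print(dot_rows("MKSNQ", {"Rat": "MKTNQ", "Chimp": "MKSNQ"}))
```

Read this way, a row consisting only of dots is identical to the reference over the displayed columns.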

Specify a different distance measure for estimating the guide tree

The available distance measures are "percent" and "paralinear".
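
As a rough illustration of the simpler of the two measures, a percent (proportion of differing sites) distance between two aligned sequences can be sketched as follows. This is not cogent3's implementation. The function name `percent_distance` is hypothetical, and skipping columns that contain a gap in either sequence is an assumption.

```python
def percent_distance(seq_a: str, seq_b: str, gap_chars: str = "-?") -> float:
    """Proportion of compared sites at which two aligned sequences differ.

    Columns with a gap character in either sequence are skipped (an
    assumption made for this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    compared = 0
    differing = 0
    for a, b in zip(seq_a, seq_b):
        if a in gap_chars or b in gap_chars:
            continue
        compared += 1
        if a != b:
            differing += 1

    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


# Illustrative fragments only: one difference in five sites.
print(percent_distance("MKSNQ", "MKTNQ"))
```

The paralinear measure is more involved, since it is based on the matrix of joint residue counts rather than a simple count of mismatches, so it is not sketched here.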

Note

An estimated guide tree has its branch lengths scaled so they are consistent with usage in a codon model.

aa_aligner = get_app("progressive_align", "protein", distance="paralinear")
aligned = aa_aligner(seqs)
aligned
             0
Human        MKSNQERSNECLPPKKREIPATSRSSEEKAPTLPSDNHRVEGTAWLPGNPGGRGHGGGRH
Mouse        ........................P.....TA......C...V....ST..I........
Rat          ........................P.....TA......C...V....ST..S........
Mouse Lemur  ...............................A.......A..AP................
Macaque      ........................P......A............................
Chimp        ............................................................

6 x 825 (truncated to 6 x 60) protein alignment

Alignment settings and file provenance are recorded in the info attribute

aligned.info
{'Refs': {},
 'source': 'data/SCA1-cds.fasta',
 'align_params': {'indel_length': 0.1,
  'indel_rate': 1e-10,
  'guide_tree': "((((Mouse:0.011219581285708921,Rat:0.004763355238688913):0.052856143725369786,Mouse_Lemur:0.03580862702845759):0.024351474041303382,Macaque:0.0023127545121458537):0.003074310394311757,(Human:0.0001974014915215993,Chimp:0.008168683695808834)'AUTOGENERATED_NAME_XM':1e-06);",
  'model': 'JTT92',
  'lnL': -3199.4382635793254}}
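
Because these settings are stored as nested dictionaries, they can be read back with ordinary indexing. The sketch below uses a plain dict as a stand-in for `aligned.info`, with branch lengths rounded from the output above. The helper `tip_names` is hypothetical and relies on a simple regular expression rather than a full Newick parser.

```python
import re

# Stand-in for aligned.info; branch lengths are rounded from the output above.
info = {
    "source": "data/SCA1-cds.fasta",
    "align_params": {
        "indel_length": 0.1,
        "indel_rate": 1e-10,
        "guide_tree": (
            "((((Mouse:0.0112,Rat:0.0048):0.0529,Mouse_Lemur:0.0358):0.0244,"
            "Macaque:0.0023):0.0031,(Human:0.0002,Chimp:0.0082)"
            "'AUTOGENERATED_NAME_XM':1e-06);"
        ),
        "model": "JTT92",
    },
}


def tip_names(newick: str) -> list[str]:
    """Return unquoted tip labels, i.e. names that directly follow '(' or ','."""
    return re.findall(r"[(,]([A-Za-z_]\w*):", newick)


params = info["align_params"]
tips = tip_names(params["guide_tree"])
print(f"source: {info['source']}")
print(f"model: {params['model']}")
print(f"guide tree tips: {', '.join(tips)}")
```

Checking the recorded tips against the sequence names is a quick way to confirm which guide tree produced a given alignment.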