Using a protein model

We use apps to load unaligned DNA sequences and to translate them into amino acids.

from cogent3 import get_app

loader = get_app("load_unaligned", format="fasta")
to_aa = get_app("translate_seqs")
process = loader + to_aa
seqs = process("data/SCA1-cds.fasta")
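To make the translation step concrete, here is a minimal standard-library sketch of what translating a coding sequence involves. It is not cogent3's implementation: `translate` and `CODON_TABLE` are hypothetical names, the standard genetic code is assumed, the input is a made-up 12-base sequence, and an incomplete trailing codon is simply dropped.

```python
from itertools import product

# Standard genetic code with codons enumerated in TCAG order (an assumption;
# other genetic codes would need a different amino-acid string).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence, stopping before the first stop codon."""
    dna = dna.upper()
    protein = []
    # Walk the sequence one complete codon at a time.
    for i in range(0, len(dna) - len(dna) % 3, 3):
        aa = CODON_TABLE.get(dna[i : i + 3], "X")  # "X" for unrecognised codons
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


print(translate("ATGAAATCCAAC"))  # MKSN
```

In practice the translate_seqs app handles this for every sequence in the collection, so a helper like this is only useful for understanding the step.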

Protein alignment with default settings

The default setting for “protein” is the JTT92 model, which is the model name recorded in the alignment's info attribute.

from cogent3 import get_app

aa_aligner = get_app("progressive_align", "protein")
aligned = aa_aligner(seqs)
aligned
             0
Human        MKSNQERSNECLPPKKREIPATSRSSEEKAPTLPSDNHRVEGTAWLPGNPGGRGHGGGRH
Mouse Lemur  ...............................A.......A..AP................
Mouse        ........................P.....TA......C...V....ST..I........
Rat          ........................P.....TA......C...V....ST..S........
Chimp        ............................................................
Macaque      ........................P......A............................

6 x 825 (truncated to 6 x 60) protein alignment
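In the display above, a "." marks a position where a sequence matches the one shown in the first row (Human here), so only the differences are spelled out. The sketch below reproduces that convention in plain Python. `dot_display` is a hypothetical helper, not a cogent3 function, and the short sequences are made up for illustration.

```python
def dot_display(reference: str, other: str) -> str:
    """Return `other` with positions identical to `reference` shown as '.'."""
    if len(reference) != len(other):
        raise ValueError("aligned sequences must have equal length")
    return "".join("." if r == o else o for r, o in zip(reference, other))


reference = "MKSNQERSNE"  # made-up reference row
variant = "MKSNAERSTE"  # made-up row differing at two positions

print(reference)
print(dot_display(reference, variant))  # ....A...T.
```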

Specify a different distance measure for estimating the guide tree

The available distance measures are "percent" and "paralinear".
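As a rough illustration of the simpler of the two measures, the sketch below computes a proportion-of-differences distance between two equal-length sequences. This is an assumed, textbook-style definition chosen for illustration. `p_distance` is a hypothetical helper and may not match how cogent3 computes its "percent" distance internally.

```python
def p_distance(seq1: str, seq2: str, gap: str = "-") -> float:
    """Proportion of differing positions, ignoring positions with a gap."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must have equal length")
    compared = differing = 0
    for a, b in zip(seq1, seq2):
        if a == gap or b == gap:
            continue  # skip positions where either sequence has a gap
        compared += 1
        differing += a != b
    if compared == 0:
        raise ValueError("no comparable positions")
    return differing / compared


# Made-up sequences that differ at 2 of 10 positions.
print(p_distance("MKSNQERSNE", "MKSNAERSTE"))  # 0.2
```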

Note

An estimated guide tree has its branch lengths scaled so they are consistent with usage in a codon model.

aa_aligner = get_app("progressive_align", "protein", distance="paralinear")
aligned = aa_aligner(seqs)
aligned
             0
Human        MKSNQERSNECLPPKKREIPATSRSSEEKAPTLPSDNHRVEGTAWLPGNPGGRGHGGGRH
Mouse        ........................P.....TA......C...V....ST..I........
Rat          ........................P.....TA......C...V....ST..S........
Mouse Lemur  ...............................A.......A..AP................
Chimp        ............................................................
Macaque      ........................P......A............................

6 x 825 (truncated to 6 x 60) protein alignment

Alignment settings and file provenance are recorded in the info attribute

aligned.info
{'Refs': {},
 'source': 'data/SCA1-cds.fasta',
 'align_params': {'indel_length': 0.1,
  'indel_rate': 1e-10,
  'guide_tree': "(((Mouse:0.011219581285708921,Rat:0.004763355238688913):0.052856143725369786,Mouse_Lemur:0.03580862702845759):0.02435147404130339,((Chimp:0.008168683695808834,Human:0.00019740149152159842):0.0030743103943117606,Macaque:0.0023127545121458537)'AUTOGENERATED_NAME_Wa':1e-06);",
  'model': 'JTT92',
  'lnL': -3208.522219790176}}
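Individual settings can be pulled out of this structure with ordinary key lookups. The sketch below assumes info behaves like a nested dictionary, as the printed output suggests. It uses a plain dict literal with values copied from that output (the long guide_tree entry is omitted) so that it runs without cogent3. With a real alignment you would start from aligned.info instead.

```python
# Values copied from the output above; the long "guide_tree" entry is omitted.
info = {
    "Refs": {},
    "source": "data/SCA1-cds.fasta",
    "align_params": {
        "indel_length": 0.1,
        "indel_rate": 1e-10,
        "model": "JTT92",
        "lnL": -3208.522219790176,
    },
}

# Alignment settings live under the "align_params" key.
params = info["align_params"]
print(f"source: {info['source']}")
print(f"model:  {params['model']}")
print(f"lnL:    {params['lnL']:.2f}")  # lnL:    -3208.52
```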