Using a protein model

We use apps to load unaligned DNA sequences and to translate them into amino acids.

from cogent3 import get_app

loader = get_app("load_unaligned", format="fasta")
to_aa = get_app("translate_seqs")
process = loader + to_aa
seqs = process("data/SCA1-cds.fasta")
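To make the translation step concrete, here is a minimal pure-Python sketch of translating a coding sequence with the standard genetic code. It is not how `translate_seqs` is implemented: the `translate` helper and the example sequence are hypothetical, and the sketch assumes an in-frame sequence with no ambiguity characters.

```python
from itertools import product

# Standard genetic code (NCBI table 1), codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA string (hypothetical helper, not a cogent3 API)."""
    dna = dna.upper()
    if len(dna) % 3:
        raise ValueError("sequence length is not a multiple of 3")
    # Look up each consecutive, non-overlapping codon; stop codons become "*".
    return "".join(CODON_TABLE[dna[i : i + 3]] for i in range(0, len(dna), 3))


print(translate("ATGAAATCCAAC"))  # made-up sequence; prints MKSN
```

In the app pipeline above, this kind of conversion is applied to every sequence in the loaded collection.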

Protein alignment with default settings

The default setting for “protein” is a WG01 model.

from cogent3 import get_app

aa_aligner = get_app("progressive_align", "protein")
aligned = aa_aligner(seqs)
aligned
             0
Human        MKSNQERSNECLPPKKREIPATSRSSEEKAPTLPSDNHRVEGTAWLPGNPGGRGHGGGRH
Chimp        ............................................................
Mouse Lemur  ...............................A.......A..AP................
Rat          ........................P.....TA......C...V....ST..S........
Mouse        ........................P.....TA......C...V....ST..I........
Macaque      ........................P......A............................

6 x 825 (truncated to 6 x 60) protein alignment
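In this display, a `.` marks a residue identical to the reference sequence in the first row (here Human). The sketch below expands that notation for the Rat row shown above. `expand_dots` is a hypothetical helper written for illustration, not part of cogent3.

```python
def expand_dots(reference: str, row: str) -> str:
    """Replace each '.' in row with the reference residue at that column."""
    if len(reference) != len(row):
        raise ValueError("rows must have the same number of columns")
    return "".join(ref if ch == "." else ch for ref, ch in zip(reference, row))


# First 60 columns of the Human and Rat rows from the display above.
human = "MKSNQERSNECLPPKKREIPATSRSSEEKAPTLPSDNHRVEGTAWLPGNPGGRGHGGGRH"
rat = "........................P.....TA......C...V....ST..S........"

rat_full = expand_dots(human, rat)
print(rat_full)

# Count the columns where Rat differs from Human in this 60-column window.
print(sum(a != b for a, b in zip(human, rat_full)))
```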

Specify a different distance measure for estimating the guide tree

Two distance measures are available: percent and paralinear.

Note

An estimated guide tree has its branch lengths scaled so they are consistent with usage in a codon model.

aa_aligner = get_app("progressive_align", "protein", distance="paralinear")
aligned = aa_aligner(seqs)
aligned
             0
Human        MKSNQERSNECLPPKKREIPATSRSSEEKAPTLPSDNHRVEGTAWLPGNPGGRGHGGGRH
Rat          ........................P.....TA......C...V....ST..S........
Mouse        ........................P.....TA......C...V....ST..I........
Mouse Lemur  ...............................A.......A..AP................
Macaque      ........................P......A............................
Chimp        ............................................................

6 x 825 (truncated to 6 x 60) protein alignment
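As a rough illustration of what a percent-style distance measures, the sketch below computes the proportion of aligned columns at which two sequences differ, skipping any column that contains a gap. This shows the general idea only: the exact calculation cogent3 performs for the percent measure may differ, `p_distance` is a hypothetical name, and the two sequences are made up.

```python
def p_distance(seq1: str, seq2: str, gap: str = "-") -> float:
    """Proportion of gap-free aligned columns where the two sequences differ."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq1, seq2):
        if a == gap or b == gap:
            continue  # ignore columns where either sequence has a gap
        compared += 1
        differing += a != b
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return differing / compared


# Made-up aligned fragments: 9 comparable columns, 1 difference.
print(p_distance("MKSNQERSNE", "MKSNQ-RSTE"))
```

A guide tree only needs to capture the approximate branching order, so a simple measure like this is often adequate for that purpose.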

Alignment settings and file provenance are recorded in the info attribute

aligned.info
{'Refs': {},
 'source': 'data/SCA1-cds.fasta',
 'align_params': {'indel_length': 0.1,
  'indel_rate': 1e-10,
  'guide_tree': "(((Rat:0.004763355238688913,Mouse:0.011219581285708921):0.052856143725369786,Mouse_Lemur:0.03580862702845759):0.02435147404130339,(Macaque:0.0023127545121458537,(Human:0.00019740149152159842,Chimp:0.008168683695808834):0.0030743103943117606)'AUTOGENERATED_NAME_mr':1e-06);",
  'model': 'WG01',
  'lnL': -3208.522219790176}}
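The recorded settings are shown as a nested dictionary, so individual values can be retrieved by key. The sketch below works on a plain `dict` copied from the output above rather than a live cogent3 object, and lists the tip names in the guide tree with a regular expression. The pattern is an assumption that suits this simple Newick string; it is not a general Newick parser.

```python
import re

# Values copied from the align_params entry shown above.
align_params = {
    "indel_length": 0.1,
    "indel_rate": 1e-10,
    "guide_tree": (
        "(((Rat:0.004763355238688913,Mouse:0.011219581285708921)"
        ":0.052856143725369786,Mouse_Lemur:0.03580862702845759)"
        ":0.02435147404130339,(Macaque:0.0023127545121458537,"
        "(Human:0.00019740149152159842,Chimp:0.008168683695808834)"
        ":0.0030743103943117606)'AUTOGENERATED_NAME_mr':1e-06);"
    ),
}

# A tip name follows "(" or "," and is terminated by ":"; internal node
# labels follow ")" and are therefore skipped.
tip_names = re.findall(r"[(,]([A-Za-z_]\w*):", align_params["guide_tree"])
print(tip_names)
```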