Using a protein model

We load the unaligned coding sequences used in the examples and translate them into protein sequences.

from cogent3.app import io, translate

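# reader loads unaligned sequences from a FASTA file
reader = io.load_unaligned(format="fasta")
# to_aa translates the loaded sequences into amino acids
to_aa = translate.translate_seqs()
# adding the two apps composes them into a single pipeline
process = reader + to_aa
seqs = process("data/SCA1-cds.fasta")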
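
As a rough illustration of what the translation step does conceptually, here is a minimal plain-Python sketch using the standard genetic code. It is not cogent3's implementation. The function name `translate_cds`, the choice to drop a trailing stop codon, and the example sequence are all assumptions made for illustration.

```python
from itertools import product

# Standard genetic code (NCBI table 1), with codons ordered by the bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_cds(cds: str) -> str:
    """Translate an in-frame coding sequence (hypothetical helper, not a cogent3 API)."""
    cds = cds.upper()
    if len(cds) % 3:
        raise ValueError("sequence length is not a multiple of 3")
    # Codons that are not in the table (e.g. with ambiguous bases) become "X".
    protein = "".join(
        CODON_TABLE.get(cds[i : i + 3], "X") for i in range(0, len(cds), 3)
    )
    # Assumption for this sketch: a single trailing stop codon is dropped.
    return protein[:-1] if protein.endswith("*") else protein


# Made-up example sequence, not taken from data/SCA1-cds.fasta.
print(translate_cds("ATGAAATCCTAA"))  # MKS
```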

Protein alignment with default settings

The default setting for “protein” is a JTT92 model, which is the model recorded in the alignment's info attribute.

from cogent3.app.align import progressive_align

aa_aligner = progressive_align("protein")
aligned = aa_aligner(seqs)
aligned
             0
Human        MKSNQERSNECLPPKKREIPATSRSSEEKAPTLPSDNHRVEGTAWLPGNPGGRGHGGGRH
Chimp        ............................................................
Mouse        ........................P.....TA......C...V....ST..I........
Rat          ........................P.....TA......C...V....ST..S........
Mouse Lemur  ...............................A.......A..AP................
Macaque      ........................P......A............................

6 x 825 (truncated to 6 x 60) protein alignment
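
In this display, a "." marks a position where a sequence is identical to the first (reference) row. The following plain-Python sketch shows how a dotted row maps back to residues; `expand_dots` is a hypothetical helper written for this illustration, not part of cogent3. The two rows are copied from the output above.

```python
def expand_dots(reference: str, row: str) -> str:
    """Replace each '.' in row with the residue at that position in reference."""
    return "".join(ref if char == "." else char for ref, char in zip(reference, row))


human = "MKSNQERSNECLPPKKREIPATSRSSEEKAPTLPSDNHRVEGTAWLPGNPGGRGHGGGRH"
mouse = "........................P.....TA......C...V....ST..I........"

mouse_full = expand_dots(human, mouse)
# Count the positions where the expanded Mouse row differs from Human.
n_diff = sum(h != m for h, m in zip(human, mouse_full))
print(mouse_full)
print(n_diff)
```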

Specify a different distance measure for estimating the guide tree

The available distance measures are percent and paralinear.

Note

An estimated guide tree has its branch lengths scaled so they are consistent with usage in a codon model.

aa_aligner = progressive_align("protein", distance="paralinear")
aligned = aa_aligner(seqs)
aligned
             0
Human        MKSNQERSNECLPPKKREIPATSRSSEEKAPTLPSDNHRVEGTAWLPGNPGGRGHGGGRH
Chimp        ............................................................
Rat          ........................P.....TA......C...V....ST..S........
Mouse        ........................P.....TA......C...V....ST..I........
Mouse Lemur  ...............................A.......A..AP................
Macaque      ........................P......A............................

6 x 825 (truncated to 6 x 60) protein alignment
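
For intuition about the percent measure named in this section, the sketch below computes the proportion of differing sites between two sequences of equal length. It is a simplified illustration, not cogent3's implementation. The function name `percent_distance` and the choice to skip positions containing a gap character are assumptions. The paralinear measure is more involved and is not shown.

```python
def percent_distance(seq1: str, seq2: str, gap_chars: str = "-?") -> float:
    """Proportion of compared sites at which two equal-length sequences differ."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must have the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq1, seq2):
        # Assumption for this sketch: positions with a gap in either sequence are ignored.
        if a in gap_chars or b in gap_chars:
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


# Made-up five-residue sequences that differ at one site.
print(percent_distance("MKSNQ", "MKTNQ"))  # 0.2
```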

Alignment settings and file provenance are recorded in the info attribute

aligned.info
{'Refs': {},
 'source': 'data/SCA1-cds.fasta',
 'align_params': {'indel_length': 0.1,
  'indel_rate': 1e-10,
  'guide_tree': '((Human:0.00019740149152159842,Chimp:0.008168683695808834):0.0030743103943117536,((Rat:0.004763355238688913,Mouse:0.011219581285708921):0.052856143725369786,Mouse_Lemur:0.03580862702845759):0.024351474041303396,Macaque:0.0023127545121458606);',
  'model': 'JTT92',
  'lnL': -3199.4388287882207}}
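
Because the info attribute displays as a nested mapping, the recorded settings can be read back with ordinary dictionary access. The sketch below works on a plain dict copied from the output above rather than on a live alignment object. The helper `tip_names` is hypothetical and uses a simple regular expression, not a full Newick parser.

```python
import re

# Copied from the aligned.info output shown above.
info = {
    "Refs": {},
    "source": "data/SCA1-cds.fasta",
    "align_params": {
        "indel_length": 0.1,
        "indel_rate": 1e-10,
        "guide_tree": "((Human:0.00019740149152159842,Chimp:0.008168683695808834):0.0030743103943117536,((Rat:0.004763355238688913,Mouse:0.011219581285708921):0.052856143725369786,Mouse_Lemur:0.03580862702845759):0.024351474041303396,Macaque:0.0023127545121458606);",
        "model": "JTT92",
        "lnL": -3199.4388287882207,
    },
}


def tip_names(newick: str) -> list[str]:
    """Return tip labels: names that directly follow '(' or ',' and precede ':'."""
    return re.findall(r"[(,]([^(),:;]+):", newick)


params = info["align_params"]
print(params["model"], params["lnL"])
print(tip_names(params["guide_tree"]))
```

Note that the guide tree stores the name as Mouse_Lemur, with an underscore, while the alignment display shows it as Mouse Lemur.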