Applying a time-reversible codon model

Note

These docs now use the new_type core objects via the following setting.

import os

# using new types without requiring an explicit argument
os.environ["COGENT3_NEW_TYPE"] = "1"

We display the full set of codon models available.

from cogent3 import available_models

available_models("codon")
Specify a model using 'Abbreviation' (case sensitive).
Model Type  Abbreviation  Description
codon       CNFGTR        Conditional nucleotide frequency codon substitution model, GTR variant (with params analagous to the nucleotide GTR model). Yap, Lindsay, Easteal and Huttley, 2010, Mol Biol Evol 27: 726-734
codon       CNFHKY        Conditional nucleotide frequency codon substitution model, HKY variant (with kappa, the ratio of transitions to transversions) Yap, Lindsay, Easteal and Huttley, 2010, Mol Biol Evol 27: 726-734
codon       MG94HKY       Muse and Gaut 1994 codon substitution model, HKY variant (with kappa, the ratio of transitions to transversions) Muse and Gaut, 1994, Mol Biol Evol, 11, 715-24
codon       MG94GTR       Muse and Gaut 1994 codon substitution model, GTR variant (with params analagous to the nucleotide GTR model) Muse and Gaut, 1994, Mol Biol Evol, 11, 715-24
codon       GY94          Goldman and Yang 1994 codon substitution model. N Goldman and Z Yang, 1994, Mol Biol Evol, 11(5):725-36.
codon       Y98           Yang's 1998 substitution model, a derivative of the GY94. Z Yang, 1998, Mol Biol Evol, 15(5):568-73
codon       H04G          Huttley 2004 CpG substitution model. Includes a term for substitutions to or from CpG's. GA Huttley, 2004, Mol Biol Evol, 21(9):1760-8
codon       H04GK         Huttley 2004 CpG substitution model. Includes a term for transition substitutions to or from CpG's. GA Huttley, 2004, Mol Biol Evol, 21(9):1760-8
codon       H04GGK        Huttley 2004 CpG substitution model. Includes a general term for substitutions to or from CpG's and an adjustment for CpG transitions. GA Huttley, 2004, Mol Biol Evol, 21(9):1760-8
codon       GNC           General Nucleotide Codon, a non-reversible codon model. Kaehler, Yap, Huttley, 2017, Gen Biol Evol 9(1): 134–49

10 rows x 3 columns

Using the conditional nucleotide frequency codon model

The CNFGTR model (Yap et al., 2010) is the most robust of the time-reversible codon models available (Kaehler et al., 2017). By default, this model does not optimise the codon frequencies; it uses the average frequencies estimated from the alignment. Here we instead set optimise_motif_probs=True so that the root motif probabilities (the codon frequencies) are optimised along with the other parameters.

from cogent3 import get_app

loader = get_app("load_aligned", format="fasta", moltype="dna")
aln = loader("data/primate_brca1.fasta")
model = get_app(
    "model",
    "CNFGTR",
    tree="data/primate_brca1.tree",
    optimise_motif_probs=True,
)
result = model(aln)
result
CNFGTR
key       lnL         nfp  DLC   unique_Q
'CNFGTR'  -6739.3076  77   True  True
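
For intuition about the default behaviour mentioned above (codon frequencies estimated from the alignment rather than optimised), the following is a minimal sketch of pooled empirical codon frequencies. It is an illustration only and not cogent3's implementation: the function name codon_frequencies, the toy sequences, and the choice to skip gapped, ambiguous and stop codons (standard genetic code) are all assumptions made for this example.

```python
from collections import Counter

# Stop codons of the standard genetic code (assumed for this sketch)
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codon_frequencies(seqs):
    """Return observed codon proportions pooled across aligned sequences.

    seqs maps a sequence name to an in-frame DNA string.
    """
    counts = Counter()
    for seq in seqs.values():
        seq = seq.upper()
        # step through the sequence in-frame, three bases at a time
        for i in range(0, len(seq) - len(seq) % 3, 3):
            codon = seq[i : i + 3]
            # skip gapped or ambiguous codons, and stop codons
            if set(codon) <= set("ACGT") and codon not in STOP_CODONS:
                counts[codon] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {codon: n / total for codon, n in sorted(counts.items())}


# hypothetical toy alignment, not the BRCA1 data used in this document
toy = {"seq1": "ATGAAAGGG", "seq2": "ATGAAGGGG", "seq3": "ATGAAA---"}
freqs = codon_frequencies(toy)
print(freqs)
```

With optimise_motif_probs=True, values of this kind are not held fixed; the codon frequencies are instead treated as free parameters and estimated by maximum likelihood.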
result.lf

CNFGTR

log-likelihood = -6739.3076

number of free parameters = 77

Global params
A/C   A/G   A/T   C/G   C/T   omega
1.07  3.94  0.79  1.95  4.23  0.76

Edge params
edge        parent  length
Galago      root    0.53
HowlerMon   root    0.14
Rhesus      edge.3  0.07
Orangutan   edge.2  0.02
Gorilla     edge.1  0.01
Human       edge.0  0.02
Chimpanzee  edge.0  0.01
edge.0      edge.1  0.00
edge.1      edge.2  0.01
edge.2      edge.3  0.04
edge.3      root    0.02

Motif params
AAA   AAC   AAG   AAT   ACA   ACC   ACG   ACT   AGA   AGC   AGG   AGT   ATA
0.05  0.02  0.03  0.05  0.02  0.01  0.00  0.03  0.02  0.03  0.01  0.04  0.02
continuation
ATC   ATG   ATT   CAA   CAC   CAG   CAT   CCA   CCC   CCG   CCT   CGA   CGC
0.01  0.01  0.02  0.02  0.01  0.02  0.02  0.02  0.00  0.00  0.02  0.00  0.00
continuation
CGG   CGT   CTA   CTC   CTG   CTT   GAA   GAC   GAG   GAT   GCA   GCC   GCG
0.00  0.01  0.01  0.01  0.01  0.01  0.07  0.01  0.03  0.03  0.02  0.01  0.00
continuation
GCT   GGA   GGC   GGG   GGT   GTA   GTC   GTG   GTT   TAC   TAT   TCA   TCC
0.01  0.02  0.01  0.01  0.01  0.02  0.01  0.01  0.02  0.00  0.02  0.02  0.01
continuation
TCG   TCT   TGC   TGG   TGT   TTA   TTC   TTG   TTT
0.00  0.02  0.00  0.00  0.02  0.02  0.01  0.01  0.01
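
One plausible way to account for the 77 free parameters reported above is sketched below. It assumes the standard genetic code (61 sense codons, matching the number of motifs listed) and that the motif probabilities are constrained to sum to 1, so one of them is determined by the others. The variable names are illustrative and are not part of the cogent3 API.

```python
from itertools import product

# Stop codons of the standard genetic code (assumed for this sketch)
STOP_CODONS = {"TAA", "TAG", "TGA"}

# all 64 nucleotide triplets, minus the stop codons
sense_codons = [
    "".join(triplet)
    for triplet in product("ACGT", repeat=3)
    if "".join(triplet) not in STOP_CODONS
]

# global parameters as listed in the output: five GTR rate terms plus omega
global_params = ["A/C", "A/G", "A/T", "C/G", "C/T", "omega"]

# one length parameter per edge listed in the output
edges = [
    "Galago", "HowlerMon", "Rhesus", "Orangutan", "Gorilla", "Human",
    "Chimpanzee", "edge.0", "edge.1", "edge.2", "edge.3",
]

# probabilities sum to 1, so only len(sense_codons) - 1 are free
free_motif_probs = len(sense_codons) - 1

nfp = len(global_params) + len(edges) + free_motif_probs
print(nfp)  # 6 + 11 + 60 = 77
```

Under this accounting, optimising the motif probabilities contributes 60 of the 77 free parameters, which is why it dominates the parameter count for this model.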