Applying GNC, a non-stationary codon model

Note

These docs now use the new_type core objects via the following setting.

import os

# using new types without requiring an explicit argument
os.environ["COGENT3_NEW_TYPE"] = "1"

See Kaehler et al. for the formal description of this model. Note that we perform hypothesis testing using this model elsewhere.

We apply this to a sample alignment.

from cogent3 import get_app

loader = get_app("load_aligned", format="fasta", moltype="dna")
aln = loader("data/primate_brca1.fasta")

The model is specified using its abbreviation.

model = get_app("model", "GNC", tree="data/primate_brca1.tree")
result = model(aln)
result
GNC
key     lnL           nfp   DLC    unique_Q
'GNC'   -6713.2743    23    True   True
result.lf

GNC

log-likelihood = -6713.2743

number of free parameters = 23

Global params
A>C   A>G   A>T   C>A   C>G   C>T   G>A   G>C   G>T   T>A   T>C   omega
0.86  3.54  0.98  1.67  2.20  6.26  7.92  1.23  0.80  1.29  3.07  0.82
Edge params
edge        parent  length
Galago      root    0.52
HowlerMon   root    0.13
Rhesus      edge.3  0.06
Orangutan   edge.2  0.02
Gorilla     edge.1  0.01
Human       edge.0  0.02
Chimpanzee  edge.0  0.01
edge.0      edge.1  0.00
edge.1      edge.2  0.01
edge.2      edge.3  0.04
edge.3      root    0.02
Motif params
AAA   AAC   AAG   AAT   ACA   ACC   ACG   ACT   AGA   AGC   AGG   AGT   ATA
0.06  0.02  0.03  0.06  0.02  0.00  0.00  0.03  0.02  0.03  0.01  0.04  0.02
continuation
ATC   ATG   ATT   CAA   CAC   CAG   CAT   CCA   CCC   CCG   CCT   CGA   CGC
0.01  0.01  0.02  0.02  0.01  0.02  0.02  0.02  0.01  0.00  0.03  0.00  0.00
continuation
CGG   CGT   CTA   CTC   CTG   CTT   GAA   GAC   GAG   GAT   GCA   GCC   GCG
0.00  0.00  0.01  0.01  0.01  0.01  0.08  0.01  0.03  0.03  0.02  0.01  0.00
continuation
GCT   GGA   GGC   GGG   GGT   GTA   GTC   GTG   GTT   TAC   TAT   TCA   TCC
0.01  0.02  0.01  0.01  0.01  0.01  0.01  0.01  0.02  0.00  0.01  0.02  0.01
continuation
TCG   TCT   TGC   TGG   TGT   TTA   TTC   TTG   TTT
0.00  0.03  0.00  0.00  0.02  0.02  0.01  0.01  0.02
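
As a reading aid, the reported number of free parameters can be tallied from the tables in the output above. The sketch below uses only the Python standard library and names copied from that output. It rests on two assumptions that the output itself does not state: that the nucleotide change absent from the global parameters (T>G) acts as the reference rate, and that the codon frequencies under "Motif params" are not counted as free parameters. Under those assumptions the tally matches the reported 23, and the 61 motifs correspond to the sense codons of the standard genetic code.

```python
from itertools import product

# Column names copied from the "Global params" table in the output above.
global_params = ["A>C", "A>G", "A>T", "C>A", "C>G", "C>T",
                 "G>A", "G>C", "G>T", "T>A", "T>C", "omega"]

# Row names copied from the "Edge params" table: one length per edge.
edges = ["Galago", "HowlerMon", "Rhesus", "Orangutan", "Gorilla", "Human",
         "Chimpanzee", "edge.0", "edge.1", "edge.2", "edge.3"]

# All 12 ordered nucleotide changes; the one not listed as a parameter is
# assumed (not confirmed by the output) to be the reference rate.
all_changes = {f"{a}>{b}" for a, b in product("ACGT", repeat=2) if a != b}
missing = sorted(all_changes - set(global_params))

# Sense codons of the standard genetic code: 64 triplets minus 3 stops.
stops = {"TAA", "TAG", "TGA"}
triplets = ("".join(c) for c in product("ACGT", repeat=3))
sense_codons = [codon for codon in triplets if codon not in stops]

# Assumption: motif (codon) frequencies are not counted as free parameters.
num_free = len(global_params) + len(edges)

print(num_free)           # global parameters plus edge lengths
print(missing)            # the nucleotide change with no reported rate
print(len(sense_codons))  # number of motifs expected in "Motif params"
```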

We can obtain the tree with branch lengths as ENS

If this tree is written to Newick format (using the write() method), the branch lengths written will be ENS (expected number of substitutions).

tree = result.tree
fig = tree.get_figure()
fig.scale_bar = "top right"
fig.show(width=500, height=500)
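
To make the Newick remark above concrete, the following sketch rebuilds a Newick string by hand from the edge, parent and length columns of the "Edge params" table. It is a minimal illustration and not cogent3's own writer: the helper `to_newick` is hypothetical, internal node names are omitted, and the lengths are the two-decimal values shown in the table rather than the full-precision estimates held by `result.tree`.

```python
# (parent, length) for each edge, copied from the "Edge params" table.
# Lengths are the rounded values displayed there, not full-precision ENS.
edges = {
    "Galago": ("root", 0.52),
    "HowlerMon": ("root", 0.13),
    "Rhesus": ("edge.3", 0.06),
    "Orangutan": ("edge.2", 0.02),
    "Gorilla": ("edge.1", 0.01),
    "Human": ("edge.0", 0.02),
    "Chimpanzee": ("edge.0", 0.01),
    "edge.0": ("edge.1", 0.00),
    "edge.1": ("edge.2", 0.01),
    "edge.2": ("edge.3", 0.04),
    "edge.3": ("root", 0.02),
}

# Invert the table: map each parent to its list of children.
children = {}
for name, (parent, _) in edges.items():
    children.setdefault(parent, []).append(name)


def to_newick(node):
    """Hypothetical helper: serialise the subtree rooted at `node`."""
    kids = children.get(node, [])
    if kids:
        # Internal node: bracketed, comma-separated children, no name.
        text = "(" + ",".join(to_newick(kid) for kid in kids) + ")"
    else:
        # Tip: just the taxon name.
        text = node
    if node != "root":
        text += f":{edges[node][1]:.2f}"
    return text


newick = to_newick("root") + ";"
total_length = sum(length for _, length in edges.values())

print(newick)
print(round(total_length, 2))  # sum of the rounded branch lengths
```

Because the input values are rounded, the total here is only an approximation of the tree length implied by the fitted model.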