Building phylogenies#
Building A Phylogenetic Tree From Pairwise Distances#
Directly via alignment.quick_tree()
#
Both the ArrayAlignment
and Alignment
classes support this.
from cogent3 import load_aligned_seqs
aln = load_aligned_seqs("data/primate_brca1.fasta", moltype="dna")
tree = aln.quick_tree(calc="TN93")
tree = tree.balanced() # purely for display
print(tree.ascii_art())
/-Orangutan
|
| /-Galago
| /edge.1--|
|-edge.2--| \-HowlerMon
-root----| |
| \-Rhesus
|
| /-Gorilla
| /edge.0--|
\edge.3--| \-Human
|
\-Chimpanzee
Using the DistanceMatrix
object#
from cogent3 import load_aligned_seqs
aln = load_aligned_seqs("data/primate_brca1.fasta", moltype="dna")
dists = aln.distance_matrix(calc="TN93")
tree = dists.quick_tree()
tree = tree.balanced() # purely for display
print(tree.ascii_art())
/-Orangutan
|
| /-Galago
| /edge.0--|
-root----|-edge.1--| \-HowlerMon
| |
| \-Rhesus
|
| /-Chimpanzee
\edge.2--|
| /-Human
\edge.3--|
\-Gorilla
Explicitly via DistanceMatrix
and cogent3.phylo.nj.nj()`
#
from cogent3 import load_aligned_seqs
from cogent3.phylo import nj
aln = load_aligned_seqs("data/primate_brca1.fasta", moltype="dna")
dists = aln.distance_matrix(calc="TN93")
tree = nj.nj(dists, show_progress=False)
tree = tree.balanced() # purely for display
print(tree.ascii_art())
/-Orangutan
|
| /-Galago
| /edge.1--|
|-edge.2--| \-HowlerMon
-root----| |
| \-Rhesus
|
| /-Gorilla
| /edge.0--|
\edge.3--| \-Human
|
\-Chimpanzee
Directly from a pairwise distance dict
#
from cogent3.phylo import nj
dists = {("a", "b"): 2.7, ("c", "b"): 2.33, ("c", "a"): 0.73}
tree = nj.nj(dists, show_progress=False)
print(tree.ascii_art())
/-c
|
-root----|--a
|
\-b
By Least-squares#
We illustrate the phylogeny reconstruction by least-squares using the F81 substitution model. We use the advanced-stepwise addition algorithm to search tree space. Here a
is the number of taxa to exhaustively evaluate all possible phylogenies for. Successive taxa are added to the top k
trees (measured by the least-squares metric) and k
trees are kept at each iteration.
from cogent3.phylo.least_squares import WLS
from cogent3.util.deserialise import deserialise_object
dists = deserialise_object("data/dists_for_phylo.json")
ls = WLS(dists)
stat, tree = ls.trex(a=5, k=5, show_progress=False)
Other optional arguments that can be passed to the trex
method are: return_all
, whether the k
best trees at the final step are returned as a ScoredTreeCollection
object; order
, a series of tip names whose order defines the sequence in which tips will be added during tree building (this allows the user to randomise the input order).
By ML#
We illustrate the phylogeny reconstruction using maximum-likelihood using the F81 substitution model. We use the advanced-stepwise addition algorithm to search tree space.
from cogent3 import load_aligned_seqs
from cogent3.evolve.models import F81
from cogent3.phylo.maximum_likelihood import ML
aln = load_aligned_seqs("data/primate_brca1.fasta", moltype="dna")
ml = ML(F81(), aln)