PhyloNode#

class PhyloNode(*args, **kwargs)#
Attributes:
length
parent

Accessor for parent.

Methods

ancestors()

Returns all ancestors back to the root.

append(i)

Appends i to self.children, in-place, cleaning up refs.

ascii_art([show_internal, compact])

Returns a string containing an ascii drawing of the tree.

balanced()

Tree 'rooted' here with no neighbour having > 50% of the edges.

bifurcating([eps, constructor, name_unnamed])

Wrap multifurcating with a num of 2

child_groups()

Returns list containing lists of children sharing a state.

child_parent_map()

return dict of {<child name>: <parent name>, ...}

compare_by_names(other)

Equality test for trees by name

compare_by_subsets(other[, exclude_absent_taxa])

Returns fraction of overlapping subsets where self and other differ.

compare_by_tip_distances(other[, sample, ...])

Compares self to other using tip-to-tip distance matrices.

compare_name(other)

Compares TreeNode by name

copy([memo, _nil, constructor])

Returns a copy of self using an iterative approach

copy_topology([constructor])

Copies only the topology and labels of a tree, not any extra data.

deepcopy([memo, _nil, constructor])

Returns a copy of self using an iterative approach

descendant_array([tip_list])

Returns numpy array with nodes in rows and descendants in columns.

distance(other)

Returns branch length between self and other.

extend(items)

Extends self.children by items, in-place, cleaning up refs.

get_connecting_edges(name1, name2)

returns a list of edges connecting two nodes.

get_connecting_node(name1, name2)

Finds the last common ancestor of the two named edges.

get_distances([endpoints])

The distance matrix as a dictionary.

get_edge_names(tip1name, tip2name[, clade, ...])

Return the list of stem and/or sub tree (clade) edge name(s).

get_edge_vector([include_root])

Collect the list of edges in postfix order

get_figure([style])

gets Dendrogram for plotting the phylogeny

get_max_tip_tip_distance()

Returns the max tip-to-tip distance between any pair of tips

get_newick([with_distances, semicolon, ...])

Return the newick string of node and its descendants

get_node_names([includeself, tipsonly])

Return a list of edges from this edge - may or may not include self.

get_nodes_dict()

Returns a dict keyed by node name, value is node

get_param_value(param, edge)

returns the parameter value for named edge

get_sub_tree(name_list[, ignore_missing, ...])

A new instance of a sub tree that contains all the otus that are listed in name_list.

get_tip_names([includeself])

return the list of the names of all tips contained by this edge

get_xml()

Return XML formatted tree string.

index_in_parent()

Returns index of self in parent.

insert(index, i)

Inserts an item at specified position in self.children.

is_root()

Returns True if the current node is a root, i.e. has no parent.

is_tip()

Returns True if the current node is a tip, i.e. has no children.

isroot()

Returns True if root of a tree, i.e. no parent.

istip()

Returns True if is tip, i.e. no children.

iter_nontips([include_self])

Iterates over nontips descended from self, [] if none.

iter_tips([include_self])

Iterates over tips descended from self, [] if self is a tip.

last_common_ancestor(other)

Finds last common ancestor of self and other, or None.

lca(other)

Finds last common ancestor of self and other, or None.

levelorder([include_self])

Performs levelorder iteration over tree

lin_rajan_moret(tree2)

return the lin-rajan-moret distance between trees

lowest_common_ancestor(tipnames)

Lowest common ancestor for a list of tipnames

make_tree_array([dec_list])

Makes an array with nodes in rows and descendants in columns.

max_tip_tip_distance()

returns the max distance between any pair of tips

multifurcating(num[, eps, constructor, ...])

return a new tree with every node having num or fewer children

name_unnamed_nodes()

sets the Data property of unnamed nodes to an arbitrary value

non_tip_children()

Returns direct children in self that have descendants.

nontips([include_self])

Returns nontips descended from self.

pop([index])

Returns and deletes child of self at index (default: -1)

postorder([include_self])

performs postorder iteration over tree.

pre_and_postorder([include_self])

Performs iteration over tree, visiting node before and after.

preorder([include_self])

Performs preorder iteration over tree.

prune()

Reconstructs correct tree after nodes have been removed.

reassign_names(mapping[, nodes])

Reassigns node names based on a mapping dict

remove(target)

Removes node by name instead of identity.

remove_deleted(is_deleted)

Removes all nodes where is_deleted tests true.

remove_node(target)

Removes node by identity instead of value.

root()

Returns root of the tree self is in.

root_at_midpoint()

return a new tree rooted at midpoint of the two tips farthest apart

rooted_at(edge_name)

Return a new tree rooted at the provided node.

rooted_with_tip(outgroup_name)

A new tree with the named tip as one of the root's children

same_shape(other)

Ignores lengths and order, so trees should be sorted first

same_topology(other)

Tests whether two trees have the same topology.

scale_branch_lengths([max_length, ultrametric])

Scales BranchLengths in place to integers for ascii output.

separation(other)

Returns number of edges separating self and other.

set_max_tip_tip_distance()

Propagate tip distance information up the tree

set_param_value(param, edge, value)

sets the value for param at named edge

set_tip_distances()

Sets distance from each node to the most distant tip.

siblings()

Returns all nodes that are children of the same parent as self.

sorted([sort_order])

An equivalent tree sorted into a standard order.

subset()

Returns set of names that descend from specified node

subsets()

Returns all sets of names that come from specified node and its kids

tip_children()

Returns direct children of self that are tips.

tip_to_tip_distances([endpoints, default_length])

Returns distance matrix between all pairs of tips, and a tip order.

tips([include_self])

Returns tips descended from self, [] if self is a tip.

tips_within_distance(distance)

Returns tips within specified distance from self

to_json()

returns json formatted string {'newick': with edges and distances, 'edge_attributes': }

to_rich_dict()

returns {'newick': with node names, 'edge_attributes': {'tip1': {'length': ...}, ...}}

total_descending_branch_length()

Returns total descending branch length from self

total_length()

returns the sum of all branch lengths in tree

traverse([self_before, self_after, include_self])

Returns iterator over descendants.

tree_distance(other[, method])

Return the specified tree distance between this and another tree.

unrooted()

A tree with at least 3 children at the root.

write(filename[, with_distances, format])

Save the tree to filename

get_node_matching_name

unrooted_deepcopy

ancestors()#

Returns all ancestors back to the root. Dynamically calculated.

append(i)#

Appends i to self.children, in-place, cleaning up refs.

ascii_art(show_internal=True, compact=False)#

Returns a string containing an ascii drawing of the tree.

Parameters:
show_internal

includes internal edge names.

compact

use exactly one line per tip.

balanced()#

Tree ‘rooted’ here with no neighbour having > 50% of the edges.

Usage:

Using a balanced tree can substantially improve performance of the likelihood calculations. Note that the resulting tree has a different orientation with the effect that specifying clades or stems for model parameterisation should be done using the ‘outgroup_name’ argument.

bifurcating(eps=None, constructor=None, name_unnamed=False)#

Wrap multifurcating with a num of 2

child_groups()#

Returns list containing lists of children sharing a state.

In other words, returns runs of tip and nontip children.

child_parent_map() → dict[str, str]#

return dict of {<child name>: <parent name>, …}

compare_by_names(other)#

Equality test for trees by name

compare_by_subsets(other, exclude_absent_taxa=False)#

Returns fraction of overlapping subsets where self and other differ.

Other is expected to be a tree object compatible with PhyloNode.

Note: names present in only one of the two trees will count as mismatches: if you don’t want this behavior, strip out the non-matching tips first.

compare_by_tip_distances(other, sample=None, dist_f=<function distance_from_r>, shuffle_f=<bound method Random.shuffle of <random.Random object>>)#

Compares self to other using tip-to-tip distance matrices.

Value returned is dist_f(m1, m2) for the two matrices. Default is to use the Pearson correlation coefficient, with +1 giving a distance of 0 and -1 giving a distance of +1 (the maximum possible value). Depending on the application, you might instead want to use distance_from_r_squared, which counts correlations of both +1 and -1 as identical (0 distance).

Note: automatically strips out the names that don’t match (this is necessary for this method because the distance between non-matching names and matching names is undefined in the tree where they don’t match, and because we need to reorder the names in the two trees to match up the distance matrices).
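The mapping from correlation to distance described above can be sketched in plain Python. This is an illustrative stand-in, not cogent3's own code: the helper names `pearson_r`, `distance_from_r_sketch` and `distance_from_r_squared_sketch` are hypothetical, and the two distance matrices are assumed to be already flattened into equal-length lists with matching tip order.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def distance_from_r_sketch(m1, m2):
    """r = +1 maps to distance 0, r = -1 maps to distance 1."""
    return (1 - pearson_r(m1, m2)) / 2


def distance_from_r_squared_sketch(m1, m2):
    """r = +1 and r = -1 both map to distance 0."""
    return 1 - pearson_r(m1, m2) ** 2


# Hypothetical flattened tip-to-tip distances for two trees.
same = [1.0, 2.0, 3.0, 4.0]
flipped = [4.0, 3.0, 2.0, 1.0]
print(distance_from_r_sketch(same, flipped))
print(distance_from_r_squared_sketch(same, flipped))
```

The first function treats perfectly anti-correlated matrices as maximally distant, while the second treats them as identical, which is the distinction the paragraph above draws.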

compare_name(other)#

Compares TreeNode by name

copy(memo=None, _nil=None, constructor='ignored')#

Returns a copy of self using an iterative approach

copy_topology(constructor=None)#

Copies only the topology and labels of a tree, not any extra data.

Useful when you want another copy of the tree with the same structure and labels, but want to e.g. assign different branch lengths and environments. Does not use deepcopy from the copy module, so _much_ faster than the copy() method.

deepcopy(memo=None, _nil=None, constructor='ignored')#

Returns a copy of self using an iterative approach

descendant_array(tip_list=None)#

Returns numpy array with nodes in rows and descendants in columns.

A value of 1 indicates that the node in that column is a descendant of the node in that row. A value of 0 indicates that it is not.

Also returns a list of nodes in the same order as they are listed in the array.

tip_list is a list of the names of the tips that will be considered, in the order they will appear as columns in the final array. Internal nodes will appear as rows in preorder traversal order.
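The layout described above (internal nodes as rows in preorder, tips as columns in tip_list order) can be illustrated with a small standalone sketch. It is not the cogent3 implementation: the tree is assumed to be given as nested tuples with string tips, plain lists are used instead of a numpy array, and `descendant_matrix` is a hypothetical name.

```python
def descendant_matrix(tree, tip_list):
    """Return (rows, clades) for a tree given as nested tuples of tip names.

    rows[i][j] is 1 if tip_list[j] descends from clades[i], otherwise 0.
    Internal nodes are listed in preorder.
    """
    rows = []
    clades = []

    def visit(node):
        if isinstance(node, str):  # a tip
            return {node}
        index = len(rows)
        rows.append(None)  # reserve the row so parents precede their children
        clades.append(node)
        tips = set()
        for child in node:
            tips |= visit(child)
        rows[index] = [1 if name in tips else 0 for name in tip_list]
        return tips

    visit(tree)
    return rows, clades


tree = (("a", "b"), ("c", "d"))
rows, clades = descendant_matrix(tree, ["a", "b", "c", "d"])
for row in rows:
    print(row)
```

The first row belongs to the root, so every column is 1; the remaining rows pick out the tips of each internal clade.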

distance(other)#

Returns branch length between self and other.

extend(items)#

Extends self.children by items, in-place, cleaning up refs.

get_connecting_edges(name1, name2)#

returns a list of edges connecting two nodes.

If both are tips, the LCA is excluded from the result.

get_connecting_node(name1, name2)#

Finds the last common ancestor of the two named edges.

get_distances(endpoints=None)#

The distance matrix as a dictionary.

Usage:

Grabs the branch lengths (evolutionary distances) as a complete matrix (i.e. a,b and b,a).

get_edge_names(tip1name, tip2name, clade=True, stem=False, outgroup_name=None)#

Return the list of stem and/or sub tree (clade) edge name(s). This is done by finding the common intersection, and then getting the list of names. If the clade traverses the root, then use the outgroup_name argument to ensure valid specification.

Parameters:
tip1name, tip2name

names of edge 1 and edge 2

stem

whether the name of the clade stem edge is returned.

clade

whether the names of the edges within the clade are returned

outgroup_name

if provided the calculation is done on a version of the tree re-rooted relative to the provided tip.

Usage:

The returned list can be used to specify subtrees for special parameterisation. For instance, say you want to allow the primates to have a different value of a particular parameter. In this case, provide the results of this method to the parameter controller method set_param_rule() along with the parameter name, etc.

get_edge_vector(include_root=True)#

Collect the list of edges in postfix order

Parameters:
include_root

specifies whether the root edge is included

get_figure(style='square', **kwargs)#

gets Dendrogram for plotting the phylogeny

Parameters:
style: string

‘square’, ‘angular’, ‘radial’ or ‘circular’

kwargs

arguments passed to Dendrogram constructor

get_max_tip_tip_distance()#

Returns the max tip-to-tip distance between any pair of tips

Returns (dist, tip_names, internal_node)

get_newick(with_distances=False, semicolon=True, escape_name=True, with_node_names=False)#

Return the newick string of node and its descendants

Parameters:
with_distances

include value of node length attribute if present.

semicolon

end tree string with a semicolon

escape_name

if a node's name contains any of the characters []’”(), the name is wrapped in single quotes

with_node_names

includes internal node names (except ‘root’)

get_node_matching_name(name)#

get_node_names(includeself=True, tipsonly=False)#

Return a list of edges from this edge - may or may not include self. This node (or first connection) will be the first, and then they will be listed in the natural traverse order.

Parameters:
includeself: bool

if False, excludes self.name from the result

tipsonly: bool

only tips returned

get_nodes_dict()#

Returns a dict keyed by node name, value is node

Will raise TreeError if non-unique names are encountered

get_param_value(param, edge)#

returns the parameter value for named edge

get_sub_tree(name_list, ignore_missing=False, keep_root=False, tipsonly=False)#

A new instance of a sub tree that contains all the otus that are listed in name_list.

Parameters:
ignore_missing

if False, get_sub_tree will raise a ValueError if name_list contains names that aren’t nodes in the tree

keep_root

if False, the root of the subtree will be the last common ancestor of all nodes kept in the subtree. Root to tip distance is then (possibly) different from the original tree. If True, the root to tip distance remains constant, but root may only have one child node.

tipsonly

only tip names matching name_list are allowed

get_tip_names(includeself=False)#

return the list of the names of all tips contained by this edge

get_xml()#

Return XML formatted tree string.

index_in_parent()#

Returns index of self in parent.

insert(index, i)#

Inserts an item at specified position in self.children.

is_root()#

Returns True if the current node is a root, i.e. has no parent.

is_tip()#

Returns True if the current node is a tip, i.e. has no children.

isroot()#

Returns True if root of a tree, i.e. no parent.

istip()#

Returns True if is tip, i.e. no children.

iter_nontips(include_self=False)#

Iterates over nontips descended from self, [] if none.

include_self, if True (default is False), will return the current node as part of the list of nontips if it is a nontip.

iter_tips(include_self=False)#

Iterates over tips descended from self, [] if self is a tip.

last_common_ancestor(other)#

Finds last common ancestor of self and other, or None.

Always tests by identity.
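To illustrate what testing by identity means here, the following is a minimal standalone sketch, not cogent3's code. It assumes a hypothetical `Node` class with only a `name` and a `parent` reference, and compares nodes with `id()` so that two distinct nodes sharing a name are never confused.

```python
class Node:
    """Hypothetical minimal node: just a name and a parent link."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent


def ancestors_and_self(node):
    """Yield node, its parent, its grandparent, ... up to the root."""
    while node is not None:
        yield node
        node = node.parent


def last_common_ancestor(a, b):
    """Return the deepest node on both root paths, or None if no shared root."""
    seen = {id(n) for n in ancestors_and_self(a)}
    for candidate in ancestors_and_self(b):
        if id(candidate) in seen:  # identity, not name equality
            return candidate
    return None


root = Node("root")
c = Node("c", root)
a = Node("a", c)
b = Node("b", c)
d = Node("d", root)
unrelated = Node("a")  # same name as a tip above, but in a different tree

print(last_common_ancestor(a, b).name)
print(last_common_ancestor(a, unrelated))
```

The `None` result for the unrelated node shows why identity matters: the two nodes share a name but not a tree.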

lca(other)#

Finds last common ancestor of self and other, or None.

Always tests by identity.

property length#

levelorder(include_self=True)#

Performs levelorder iteration over tree

lin_rajan_moret(tree2) → int#

return the lin-rajan-moret distance between trees

Returns:
int

the Lin-Rajan-Moret distance

Notes

This is a distance measure that exhibits superior statistical properties compared to Robinson-Foulds. It can only be applied to unrooted trees.

see: Lin et al. (2012). A Metric for Phylogenetic Trees Based on Matching. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 9(4), 1014-1022.

lowest_common_ancestor(tipnames)#

Lowest common ancestor for a list of tipnames

This should be around O(H sqrt(n)), where H is height and n is the number of tips passed in.

make_tree_array(dec_list=None)#

Makes an array with nodes in rows and descendants in columns.

A value of 1 indicates that the node in that column is a descendant of the node in that row. A value of 0 indicates that it is not.

Also returns a list of nodes in the same order as they are listed in the array.

max_tip_tip_distance()#

returns the max distance between any pair of tips

Also returns the tip names that it is between as a tuple

multifurcating(num, eps=None, constructor=None, name_unnamed=False)#

return a new tree with every node having num or fewer children

Parameters:
num: int

the maximum number of children a node can have

eps: float

default branch length to set if self or constructor is of PhyloNode type

constructor

a TreeNode or subclass constructor. If None, uses self

name_unnamed: bool

names unnamed nodes

name_unnamed_nodes()#

sets the Data property of unnamed nodes to an arbitrary value

Internal nodes are often unnamed and so this function assigns a value for referencing.

non_tip_children()#

Returns direct children in self that have descendants.

nontips(include_self=False)#

Returns nontips descended from self.

property parent#

Accessor for parent.

If using an algorithm that accesses parent a lot, it will be much faster to access self._parent directly, but don’t do it if mutating self._parent! (or, if you must, remember to clean up the refs).

pop(index=-1)#

Returns and deletes child of self at index (default: -1)

postorder(include_self=True)#

performs postorder iteration over tree.

Notes

This is somewhat inelegant compared to saving the node and its index on the stack, but is 30% faster in the average case and 3x faster in the worst case (for a comb tree).

pre_and_postorder(include_self=True)#

Performs iteration over tree, visiting node before and after.

preorder(include_self=True)#

Performs preorder iteration over tree.

prune()#

Reconstructs correct tree after nodes have been removed.

Internal nodes with only one child will be removed and new connections and Branch lengths will be made to reflect change.
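The collapsing step described above can be sketched as follows. This is a simplified illustration rather than cogent3's implementation: the `Node` class and `prune_single_child_nodes` are hypothetical, branch lengths are assumed to be plain floats, and a root that itself has a single child is deliberately left untouched.

```python
class Node:
    """Hypothetical minimal node with a branch length and children."""

    def __init__(self, name, length=0.0, children=()):
        self.name = name
        self.length = length
        self.children = list(children)


def prune_single_child_nodes(node):
    """Collapse internal nodes below `node` that have exactly one child."""
    kept = []
    for child in node.children:
        # Walk down a chain of single-child nodes, accumulating branch length
        # so that distances to the surviving descendants are preserved.
        while len(child.children) == 1:
            only_child = child.children[0]
            only_child.length += child.length
            child = only_child
        prune_single_child_nodes(child)
        kept.append(child)
    node.children = kept


tip_a = Node("a", 2.0)
lonely = Node("x", 1.0, [tip_a])  # internal node left with a single child
tip_b = Node("b", 0.5)
root = Node("root", 0.0, [lonely, tip_b])

prune_single_child_nodes(root)
print([(child.name, child.length) for child in root.children])
```

After pruning, tip "a" hangs directly off the root and its branch length is the sum of the two branches it replaced.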

reassign_names(mapping, nodes=None)#

Reassigns node names based on a mapping dict

mapping : dict, old_name -> new_name

nodes : specific nodes for renaming (such as just tips, etc…)

remove(target)#

Removes node by name instead of identity.

Returns True if node was present, False otherwise.

remove_deleted(is_deleted)#

Removes all nodes where is_deleted tests true.

Internal nodes that have no children as a result of removing deleted are also removed.

remove_node(target)#

Removes node by identity instead of value.

Returns True if node was present, False otherwise.

root()#

Returns root of the tree self is in. Dynamically calculated.

root_at_midpoint()#

return a new tree rooted at midpoint of the two tips farthest apart

This function does not preserve the internal node naming or structure, but it does keep tip-to-tip distances correct. It uses unrooted_deepcopy().

rooted_at(edge_name)#

Return a new tree rooted at the provided node.

Usage:

This can be useful for drawing unrooted trees with an orientation that reflects knowledge of the true root location.

rooted_with_tip(outgroup_name)#

A new tree with the named tip as one of the root’s children

same_shape(other)#

Ignores lengths and order, so trees should be sorted first

same_topology(other)#

Tests whether two trees have the same topology.

scale_branch_lengths(max_length=100, ultrametric=False)#

Scales BranchLengths in place to integers for ascii output.

Warning: tree might not be exactly the length you specify.

Set ultrametric=True if you want all the root-tip distances to end up precisely the same.

separation(other)#

Returns number of edges separating self and other.

set_max_tip_tip_distance()#

Propagate tip distance information up the tree

This method was originally implemented by Julia Goodrich with the intent of being able to determine max tip to tip distances between nodes on large trees efficiently. The code has been modified to track the specific tips the distance is between

set_param_value(param, edge, value)#

sets the value for param at named edge

set_tip_distances()#

Sets distance from each node to the most distant tip.

siblings()#

Returns all nodes that are children of the same parent as self.

Note: excludes self from the list. Dynamically calculated.

sorted(sort_order=None)#

An equivalent tree sorted into a standard order. If sort_order is not specified then alphabetical order is used. At each node starting from root, the algorithm will try to put the descendant which contains the lowest scoring tip on the left.

subset()#

Returns set of names that descend from specified node

subsets()#

Returns all sets of names that come from specified node and its kids

tip_children()#

Returns direct children of self that are tips.

tip_to_tip_distances(endpoints=None, default_length=1)#

Returns distance matrix between all pairs of tips, and a tip order.

Warning: .__start and .__stop added to self and its descendants.

tip_order contains the actual node objects, not their names (may be confusing in some cases).

tips(include_self=False)#

Returns tips descended from self, [] if self is a tip.

tips_within_distance(distance)#

Returns tips within specified distance from self

Branch lengths of None will be interpreted as 0

to_json()#

returns json formatted string {‘newick’: with edges and distances, ‘edge_attributes’: }

to_rich_dict()#

returns {‘newick’: with node names, ‘edge_attributes’: {‘tip1’: {‘length’: …}, …}}

total_descending_branch_length()#

Returns total descending branch length from self

total_length()#

returns the sum of all branch lengths in tree

traverse(self_before=True, self_after=False, include_self=True)#

Returns iterator over descendants. Iterative: safe for large trees.

self_before includes each node before its descendants if True.

self_after includes each node after its descendants if True.

include_self includes the initial node if True.

self_before and self_after are independent. If neither is True, only terminal nodes will be returned.

Note that if self is terminal, it will only be included once even if self_before and self_after are both True.

This is a depth-first traversal. Since the trees are not binary, preorder and postorder traversals are possible, but inorder traversals would depend on the data in the tree and are not handled here.
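The interaction of the three flags can be illustrated with a small standalone sketch. It is not cogent3's code: the `Node` class is hypothetical, the function yields names rather than node objects, and it uses recursion for brevity, whereas the method documented here is stated to be iterative.

```python
class Node:
    """Hypothetical minimal node: a name and a list of children."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)


def traverse(node, self_before=True, self_after=False, include_self=True):
    """Depth-first traversal yielding node names according to the flags."""
    if not node.children:
        # A terminal node is yielded once, whatever the before/after flags are.
        if include_self:
            yield node.name
        return
    if self_before and include_self:
        yield node.name
    for child in node.children:
        # include_self only applies to the starting node.
        yield from traverse(child, self_before, self_after, True)
    if self_after and include_self:
        yield node.name


tree = Node("root", [Node("c", [Node("a"), Node("b")]), Node("d")])

print(list(traverse(tree)))  # preorder
print(list(traverse(tree, self_before=False, self_after=True)))  # postorder
print(list(traverse(tree, self_before=False, self_after=False)))  # tips only
print(list(traverse(tree, self_before=True, self_after=True)))  # before and after
```

With both flags set, each internal node appears twice (on the way down and on the way up) while each tip appears once.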

tree_distance(other: TreeNode, method: str | None = None) → int#

Return the specified tree distance between this and another tree.

Defaults to the Lin-Rajan-Moret distance on unrooted trees. Defaults to the Matching Cluster distance on rooted trees.

Parameters:
other: TreeNode

The other tree to calculate the distance between.

method: str | None

The tree distance metric to use.

Options are:
“rooted_robinson_foulds”: The Robinson-Foulds distance for rooted trees.
“unrooted_robinson_foulds”: The Robinson-Foulds distance for unrooted trees.
“matching_cluster”: The Matching Cluster distance for rooted trees.
“lin_rajan_moret”: The Lin-Rajan-Moret distance for unrooted trees.
“rrf”: An alias for rooted_robinson_foulds.
“urf”: An alias for unrooted_robinson_foulds.
“mc”: An alias for matching_cluster.
“lrm”: An alias for lin_rajan_moret.
“rf”: The unrooted/rooted Robinson-Foulds distance for unrooted/rooted trees.
“matching”: The Lin-Rajan-Moret/Matching Cluster distance for unrooted/rooted trees.

Default is “matching”.

Returns:
int

the chosen distance between the two trees.

Notes

The Lin-Rajan-Moret distance [2] and Matching Cluster distance [1] display statistical properties superior to those of the Robinson-Foulds distance [3] on unrooted and rooted trees respectively.

References

[1]

Bogdanowicz, D., & Giaro, K. (2013). On a matching distance between rooted phylogenetic trees. International Journal of Applied Mathematics and Computer Science, 23(3), 669-684.

[2]

Lin et al. (2012). A Metric for Phylogenetic Trees Based on Matching. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 9(4), 1014-1022.

[3]

Robinson, David F., and Leslie R. Foulds. Comparison of phylogenetic trees. Mathematical biosciences 53.1-2 (1981): 131-147.
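As a point of comparison for the metrics listed above, the rooted Robinson-Foulds distance is commonly defined as the number of clusters (sets of tip names below an internal node) found in one tree but not the other. The sketch below follows that textbook definition only; it is not cogent3's implementation and may differ from it in details. Trees are assumed to be nested tuples with string tips, and `clusters` and `rooted_rf_sketch` are hypothetical names.

```python
def clusters(tree):
    """Return the set of tip-name clusters, one per internal node."""
    found = set()

    def tips_below(node):
        if isinstance(node, str):  # a tip
            return frozenset([node])
        tips = frozenset().union(*(tips_below(child) for child in node))
        found.add(tips)
        return tips

    tips_below(tree)
    return found


def rooted_rf_sketch(tree1, tree2):
    """Size of the symmetric difference between the two cluster sets."""
    return len(clusters(tree1) ^ clusters(tree2))


t1 = (("a", "b"), ("c", "d"))
t2 = (("a", "c"), ("b", "d"))
print(rooted_rf_sketch(t1, t1))
print(rooted_rf_sketch(t1, t2))
```

The two example trees share only the root cluster, so each contributes two unmatched clusters to the distance.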

unrooted()#

A tree with at least 3 children at the root.

unrooted_deepcopy(constructor=None, parent=None)#

write(filename, with_distances=True, format=None)#

Save the tree to filename

Parameters:
filename

path of the file the tree is saved to

with_distances

whether branch lengths are included in string.

format

default is newick; xml and json are alternatives. This argument overrides the filename suffix. All attributes are saved in the xml format.

Notes

Only the cogent3 json and xml tree formats are supported.