Testing a hypothesis – non-stationary or time-reversible#

We test the hypothesis that the time-reversible GTR model is sufficient for a data set, compared with GN, the non-stationary general nucleotide model.

from cogent3 import get_app

loader = get_app("load_aligned", format_name="fasta", moltype="dna")
aln = loader("data/primate_brca1.fasta")

tree = "data/primate_brca1.tree"

null = get_app("model", "GTR", tree=tree, optimise_motif_probs=True)
alt = get_app("model", "GN", tree=tree, optimise_motif_probs=True)
hyp = get_app("hypothesis", null, alt)
result = hyp(aln)
type(result)

cogent3.app.result.hypothesis_result

result is a hypothesis_result object. The repr() displays the likelihood ratio test statistic, degrees of freedom and associated p-value.

result

Statistics
LR	df	pvalue
9.3813	6	0.1532

hypothesis	key	lnL	nfp	DLC	unique_Q
null	'GTR'	-6992.5769	19	True	True
alt	'GN'	-6987.8862	25	True	True

In this case, we fail to reject the null since the p-value is > 0.05. We use this object to demonstrate the properties of a hypothesis_result.
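
As a rough check on the statistics table, the sketch below recomputes LR, df and the p-value from the rounded lnL and nfp values displayed above. It assumes the p-value is the upper tail of a chi-square distribution with df degrees of freedom. The helper chi2_sf_even_df is a hypothetical name introduced here for illustration. It is not part of cogent3 and uses a closed form that is valid only for even df.

```python
import math

# Values copied from the (rounded) hypothesis table above
lnL_null, nfp_null = -6992.5769, 19  # GTR
lnL_alt, nfp_alt = -6987.8862, 25  # GN

# Likelihood ratio statistic and degrees of freedom
LR = 2 * (lnL_alt - lnL_null)
df = nfp_alt - nfp_null


def chi2_sf_even_df(x, df):
    """Upper tail probability of a chi-square distribution (even df only).

    Hypothetical helper for illustration, not a cogent3 function.
    Uses sf(x) = exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
    """
    if df <= 0 or df % 2:
        raise ValueError("closed form requires a positive, even df")
    half = x / 2
    term = 1.0  # the k = 0 term
    total = 1.0
    for k in range(1, df // 2):
        term *= half / k  # builds (x/2)^k / k! incrementally
        total += term
    return math.exp(-half) * total


pvalue = chi2_sf_even_df(LR, df)
print(f"LR={LR:.4f} df={df} pvalue={pvalue:.4f}")
```

Because the inputs are rounded to four decimal places, the recomputed LR can differ slightly from the reported value in its last digit.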

`hypothesis_result` has attributes and keys#

Accessing the test statistics#

result.LR, result.df, result.pvalue

(9.381296736692093, 6, np.float64(0.15324238178249514))

The null hypothesis#

This model is accessed via the null attribute.

result.null

GTR
key	lnL	nfp	DLC	unique_Q
'GTR'	-6992.5769	19	True	True

result.null.lf

GTR

log-likelihood = -6992.5769

number of free parameters = 19

Global params
A/C	A/G	A/T	C/G	C/T
1.23	5.25	0.95	2.34	5.97

Edge params
edge	parent	length
Galago	root	0.17
HowlerMon	root	0.04
Rhesus	edge.3	0.02
Orangutan	edge.2	0.01
Gorilla	edge.1	0.00
Human	edge.0	0.01
Chimpanzee	edge.0	0.00
edge.0	edge.1	0.00
edge.1	edge.2	0.00
edge.2	edge.3	0.01
edge.3	root	0.01

Motif params
A	C	G	T
0.38	0.17	0.21	0.24

The alternate hypothesis#

This model is accessed via the alt attribute.

result.alt.lf

GN

log-likelihood = -6987.8862

number of free parameters = 25

Global params
A>C	A>G	A>T	C>A	C>G	C>T	G>A	G>C	G>T	T>A	T>C
0.87	3.67	0.91	1.59	2.13	6.03	8.22	1.23	0.63	1.25	3.41

Edge params
edge	parent	length
Galago	root	0.17
HowlerMon	root	0.04
Rhesus	edge.3	0.02
Orangutan	edge.2	0.01
Gorilla	edge.1	0.00
Human	edge.0	0.01
Chimpanzee	edge.0	0.00
edge.0	edge.1	0.00
edge.1	edge.2	0.00
edge.2	edge.3	0.01
edge.3	root	0.01

Motif params
A	C	G	T
0.38	0.18	0.21	0.24
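
The six degrees of freedom reported for the test can be read off the two parameter displays. The sketch below tallies the free parameters under two assumptions: each row of the "Edge params" table contributes one length, and the four motif probabilities sum to one, leaving three free. The variable names are illustrative only.

```python
# Counts read off the displayed GTR and GN parameter tables
n_edges = 11  # rows in each "Edge params" table, one length per edge
n_free_motif = 4 - 1  # four base frequencies constrained to sum to 1

# Number of columns in each "Global params" table
rate_terms = {"GTR": 5, "GN": 11}

# Free parameters per model: rate terms + edge lengths + free motif probs
nfp = {name: n_rates + n_edges + n_free_motif for name, n_rates in rate_terms.items()}

# Degrees of freedom for the likelihood ratio test
df = nfp["GN"] - nfp["GTR"]
print(nfp, df)
```

These tallies agree with the nfp values of 19 and 25 shown for the null and alternate models, so the difference comes entirely from the six extra rate terms in GN.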

Saving hypothesis results#

You are advised to save these results as serialised data since this provides maximum flexibility for downstream analyses.

The following would write the result into a sqlitedb.

from cogent3 import get_app, open_data_store

output = open_data_store("path/to/myresults.sqlitedb", mode="w")
writer = get_app("write_db", data_store=output)
writer(result)