Writing tabular data
The write_tabular app writes cogent3 “TabularTypes” (Table, DictArray, DistanceMatrix) to disk.
Let’s generate a cogent3 Table to use in the examples below. One way to do that is to apply the tabulate_stats app to a model result.
from cogent3 import get_app
# load alignment
load_aligned_app = get_app("load_aligned", moltype="dna")
aln = load_aligned_app("data/primate_brca1.fasta")
# fit GN model
gn_model_app = get_app("model", "GN", tree="data/primate_brca1.tree")
model_result = gn_model_app(aln)
# tabulate the model result
tabulator = get_app("tabulate_stats")
model_result_tab = tabulator(model_result)
motif_params = model_result_tab["motif params"]
print(type(motif_params))
motif_params
<class 'cogent3.util.table.Table'>
A | C | G | T |
---|---|---|---|
0.3757 | 0.1742 | 0.2095 | 0.2406 |
1 rows x 4 columns
Writing a CSV file
To write in CSV format, we create the write_tabular app with format="csv".
from cogent3 import get_app, open_data_store
out_dstore = open_data_store(path_to_dir, mode="w", suffix="csv")
write_tabular_app = get_app("write_tabular", data_store=out_dstore, format="csv")
write_tabular_app(motif_params, identifier="gn_model_results.csv")
DataMember(data_store=/home/runner/work/cogent3.github.io/cogent3.github.io/c3org/doc/doc/tmpcv0f8r7v, unique_id=gn_model_results.csv)
Writing a TSV file
To write in TSV format, we create the write_tabular app with format="tsv".
from cogent3 import get_app, open_data_store
out_dstore = open_data_store(path_to_dir, mode="w", suffix="tsv")
write_tabular_app = get_app("write_tabular", data_store=out_dstore, format="tsv")
write_tabular_app(motif_params, identifier="gn_model_results.tsv")
DataMember(data_store=/home/runner/work/cogent3.github.io/cogent3.github.io/c3org/doc/doc/tmpm1w9scw3, unique_id=gn_model_results.tsv)
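To check what was written, the files can be read back with the csv module from the Python standard library. The sketch below is not part of cogent3. It assumes the written file starts with a header row of column names and is delimited by commas (CSV) or tabs (TSV), which is not verified here. To stay self-contained, it first writes a stand-in file using the rounded values displayed above; read_delimited is a hypothetical helper name.

```python
import csv
import tempfile
from pathlib import Path


def read_delimited(path):
    """Return (header, rows) from a CSV or TSV file, choosing the delimiter by suffix."""
    path = Path(path)
    delimiter = "\t" if path.suffix == ".tsv" else ","
    with path.open(newline="") as infile:
        reader = csv.reader(infile, delimiter=delimiter)
        header = next(reader)  # first line is assumed to hold the column names
        rows = [row for row in reader]
    return header, rows


# Stand-in for a file produced by write_tabular
# (assumed layout: one header row, then one line per table row).
tmp_dir = Path(tempfile.mkdtemp())
example = tmp_dir / "gn_model_results.tsv"
example.write_text("A\tC\tG\tT\n0.3757\t0.1742\t0.2095\t0.2406\n")

header, rows = read_delimited(example)
print(header)
print(rows)
```

With real output, pass the path of the file inside path_to_dir instead of the stand-in. Values come back as strings, so convert them with float() if you need numbers.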
Using write_tabular in a composed process
Instead of applying the apps sequentially as above, we can combine apps into a composed process and apply that process to a data store. In this example, we define a process that loads unaligned sequences, estimates Jaccard distances between them, and writes the estimated distances to a TSV file.
from cogent3 import get_app, open_data_store
loader = get_app("load_unaligned", moltype="dna")
jdist = get_app("jaccard_dist")
out_dstore = open_data_store(path_to_dir, mode="w", suffix="tsv")
writer = get_app("write_tabular", data_store=out_dstore, format="tsv")
process = loader + jdist + writer
in_dstore = open_data_store("data", suffix="fasta", mode="r", limit=2)
result = process.apply_to(in_dstore)
result.describe
record type | number |
---|---|
completed | 1 |
not_completed | 1 |
logs | 1 |
3 rows x 2 columns
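For intuition about what an alignment-free distance measure can look like, here is a minimal pure-Python sketch of a Jaccard distance between the k-mer sets of two sequences. It is an illustration only and not the implementation behind the jaccard_dist app; the function names, the choice of k=3, and the toy sequences are all assumptions made for this example.

```python
def kmers(seq, k):
    """Return the set of all overlapping substrings of length k in seq."""
    return {seq[i : i + k] for i in range(len(seq) - k + 1)}


def jaccard_distance(seq1, seq2, k=3):
    """Return 1 - |intersection| / |union| of the two k-mer sets."""
    a, b = kmers(seq1, k), kmers(seq2, k)
    union = a | b
    if not union:
        # Both sequences are shorter than k, so there is nothing to compare;
        # this sketch treats that case as zero distance.
        return 0.0
    return 1 - len(a & b) / len(union)


# Identical sequences share every k-mer; these two toy sequences share none.
d_same = jaccard_distance("ACGTACGT", "ACGTACGT")
d_diff = jaccard_distance("ACGTACGT", "TTTTTTTT")
print(d_same, d_diff)
```

Because only sets of k-mers are compared, no alignment is needed, which is why such a measure can sit directly after a load_unaligned step in a composed process.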
Tip
When running this code on your machine, remember to replace path_to_dir with an actual directory path.
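One way to obtain a throwaway directory for path_to_dir is the tempfile module from the Python standard library. This is a sketch that assumes you only need the output for the duration of the script; for results you want to keep, point path_to_dir at a permanent location instead. The variables dir_existed and dir_removed exist only to show the directory's lifetime.

```python
import tempfile
from pathlib import Path

# TemporaryDirectory deletes the directory and its contents when the block exits.
with tempfile.TemporaryDirectory() as tmp:
    path_to_dir = Path(tmp)
    # ... open the output data store and run the write_tabular examples here ...
    dir_existed = path_to_dir.is_dir()

# After the block, the directory no longer exists.
dir_removed = not path_to_dir.exists()
print(dir_existed, dir_removed)
```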