Sample an alignment to a fixed length
Let’s load a protein alignment of five mammals to use in the examples.
from cogent3 import get_app
loader = get_app("load_aligned", moltype="protein", format_name="phylip")
aln = loader("data/abglobin_aa.phylip")
aln
0 | |
goat-cow | VLSAADKSNVKAAWGKVGGNAGAYGAEALERMFLSFPTTKTYFPHFDLSHGSAQVKGHGE |
human | ...P...T..........AH..E....................................K |
rabbit | ...P...T.I.T..E.I.SHG.E.....V.....G............FT...E.I.A..K |
rat | ....D..T.I.NC...I..HG.E..E...Q...AA........S.I.V.P......A..K |
marsupial | ...D...TH...I......H....A....A.T.................P....IQ...K |
5 x 285 (truncated to 5 x 60) protein alignment
How to sample the first n positions of an alignment
We can use the fixed_length app to sample an alignment to a fixed length. By default, it samples from the beginning of the alignment. The argument length=20 specifies how many positions to sample.
from cogent3 import get_app
first_20 = get_app("fixed_length", length=20)
first_20(aln)
0 | |
goat-cow | VLSAADKSNVKAAWGKVGGN |
human | ...P...T..........AH |
rabbit | ...P...T.I.T..E.I.SH |
rat | ....D..T.I.NC...I..H |
marsupial | ...D...TH...I......H |
5 x 20 protein alignment
How to sample n positions from within an alignment
Creating the fixed_length app with the argument start=x specifies that the sample should begin x positions into the alignment. Below, start=10 skips the first 10 positions and returns the next 20.
from cogent3 import get_app
skip_10_take_20 = get_app("fixed_length", length=20, start=10)
skip_10_take_20(aln)
0 | |
goat-cow | KAAWGKVGGNAGAYGAEALE |
human | ........AH..E....... |
rabbit | .T..E.I.SHG.E.....V. |
rat | .NC...I..HG.E..E...Q |
marsupial | ..I......H....A....A |
5 x 20 protein alignment
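To make the relationship between start and length concrete, here is a minimal plain-Python sketch of the windowing behaviour shown above. It is not cogent3's implementation: the helper name take_window and the dict-of-strings representation are assumptions made for illustration, and the sequences are the first 30 positions of two rows from the alignment displayed earlier.

```python
def take_window(seqs, length, start=0):
    """Return columns [start, start + length) of every aligned sequence.

    Hypothetical helper for illustration only; `seqs` maps names to
    equal-length strings.
    """
    aln_len = len(next(iter(seqs.values())))
    if start < 0 or start + length > aln_len:
        raise ValueError("window does not fit inside the alignment")
    return {name: seq[start:start + length] for name, seq in seqs.items()}


# First 30 positions of two sequences from the alignment shown above.
seqs = {
    "goat-cow": "VLSAADKSNVKAAWGKVGGNAGAYGAEALE",
    "human": "VLSPADKTNVKAAWGKVGAHAGEYGAEALE",
}

first_20 = take_window(seqs, length=20)  # start defaults to 0
skip_10_take_20 = take_window(seqs, length=20, start=10)

print(first_20["goat-cow"])
print(skip_10_take_20["goat-cow"])
```

In this sketch the window is a half-open slice, so start=10 with length=20 covers positions 10 through 29, which matches the goat-cow row in the output above.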
How to sample n positions randomly from within an alignment
The start position can be selected at random with random=True. An optional seed can be provided so that the same start position is used each time the app is called.
from cogent3 import get_app
random_20 = get_app("fixed_length", length=20, random=True)
random_20(aln)
0 | |
goat-cow | LKSKTSFVTLREAANGVAGA |
human | ............P....... |
rabbit | ..G..............S.. |
rat | ..A.A......D.....G.. |
marsupial | ...QS....M.GP....... |
5 x 20 protein alignment
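The effect of a seed on a randomly chosen start position can be sketched in plain Python. This is an illustrative assumption about the behaviour described above, not cogent3's code: the helper random_window, its return value, and the use of Python's random module are all hypothetical.

```python
import random


def random_window(seqs, length, seed=None):
    """Pick a random start so the window fits, then slice every sequence.

    Hypothetical helper for illustration only; returns (start, windowed seqs).
    """
    aln_len = len(next(iter(seqs.values())))
    if length > aln_len:
        raise ValueError("length exceeds the alignment length")
    rng = random.Random(seed)  # a fixed seed gives a reproducible start
    start = rng.randint(0, aln_len - length)  # inclusive upper bound
    window = {name: seq[start:start + length] for name, seq in seqs.items()}
    return start, window


# First 30 positions of two sequences from the alignment shown earlier.
seqs = {
    "goat-cow": "VLSAADKSNVKAAWGKVGGNAGAYGAEALE",
    "human": "VLSPADKTNVKAAWGKVGAHAGEYGAEALE",
}

start_a, sample_a = random_window(seqs, length=20, seed=1)
start_b, sample_b = random_window(seqs, length=20, seed=1)
print(start_a == start_b)  # same seed, same start position
```

Without a seed, repeated calls may return different windows; with one, the sample is reproducible, which is useful when results need to be regenerated.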