SeqsData
- class SeqsData(*, data: dict[str, str | bytes | ndarray[int]], alphabet: CharAlphabet, offset: dict[str, int] | None = None, check: bool = True, reversed_seqs: set[str] | None = None)
The builtin cogent3 implementation of sequence storage underlying a SequenceCollection. The sequence data is stored as numpy arrays. Indexing this object (using an int or seq name) returns a SeqDataView, which can realise the corresponding slice as a string, bytes, or numpy array via the alphabet.
- Attributes:
alphabet: the character alphabet for validating, encoding, decoding sequences
names: returns the names of the sequences in the storage
offset: annotation offsets for each sequence
reversed_seqs: names of sequences that are reverse complemented
Methods
add_seqs(seqs[, force_unique_keys, offset]): Returns a new SeqsData object with added sequences.
copy(**kwargs): shallow copy of self
get_seq_length(seqid): return length for seqid
get_view(seqid): returns view of sequence data for seqid
from_seqs
get_seq_array
get_seq_bytes
get_seq_str
to_alphabet
Notes
Methods on this object only accept plus strand start, stop and step indices for selecting segments of data. It can return the gap coordinates for a sequence as used by IndelMap.
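The class description says sequences are held as integer arrays and realised as strings or bytes via the alphabet. The following is a minimal sketch of that idea in plain Python, not cogent3 code: `ALPHABET`, `encode` and `decode` are hypothetical names, the character order is an arbitrary assumption, and a list stands in for the numpy array.

```python
# Toy model only: every name here is hypothetical, not part of cogent3.
ALPHABET = "TCAG"  # assumed character order, chosen purely for illustration
CHAR_TO_INDEX = {char: index for index, char in enumerate(ALPHABET)}


def encode(seq: str) -> list[int]:
    """Convert each character to its integer position in ALPHABET."""
    return [CHAR_TO_INDEX[char] for char in seq]


def decode(indices: list[int]) -> str:
    """Convert integer positions back to their characters."""
    return "".join(ALPHABET[index] for index in indices)


stored = encode("ACGT")            # the integer form a storage object would hold
as_str = decode(stored[1:3])       # realise a slice of it as a string
as_bytes = as_str.encode("utf8")   # realise the same slice as bytes
```

Slicing the integer form first and decoding afterwards means only the requested segment is converted.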
- add_seqs(seqs: dict[str, str | bytes | ndarray[int]], force_unique_keys: bool = True, offset: dict[str, int] | None = None) -> SeqsData
Returns a new SeqsData object with added sequences. If force_unique_keys is True, raises ValueError if any names already exist in the collection.
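The two behaviours described above (a new object is returned, and duplicate names raise ValueError when force_unique_keys is True) can be sketched with a plain dict standing in for the storage. `add_seqs_sketch` is a hypothetical helper, not the cogent3 method. What happens to duplicates when force_unique_keys is False is not stated here, so the sketch assumes the incoming sequence replaces the existing one.

```python
def add_seqs_sketch(
    existing: dict[str, str],
    seqs: dict[str, str],
    force_unique_keys: bool = True,
) -> dict[str, str]:
    """Return a new mapping holding the existing and the added sequences."""
    duplicates = existing.keys() & seqs.keys()
    if force_unique_keys and duplicates:
        raise ValueError(f"names already exist: {sorted(duplicates)}")
    # ASSUMPTION: with force_unique_keys=False, incoming values win.
    return {**existing, **seqs}


store = {"s1": "ACGT"}
bigger = add_seqs_sketch(store, {"s2": "GGCC"})

try:
    add_seqs_sketch(store, {"s1": "TTTT"})
    raised = False
except ValueError:
    raised = True
```

Here `store` is left unchanged, mirroring the "returns a new object" contract.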
- property alphabet: CharAlphabet
the character alphabet for validating, encoding, decoding sequences
- copy(**kwargs) -> Self
shallow copy of self
Notes
kwargs are passed to constructor and will over-ride existing values
- classmethod from_seqs(*, data: dict[str, str | bytes | ndarray[int]], alphabet: AlphabetABC, **kwargs)
- get_seq_array(*, seqid: str, start: int | None = None, stop: int | None = None, step: int | None = None) -> ndarray
- get_seq_bytes(*, seqid: str, start: int | None = None, stop: int | None = None, step: int | None = None) -> bytes
- get_seq_length(seqid: str) -> int
return length for seqid
- get_seq_str(*, seqid: str, start: int | None = None, stop: int | None = None, step: int | None = None) -> str
- get_view(seqid: str) -> SeqDataView
returns view of sequence data for seqid
- property names: tuple[str, ...]
returns the names of the sequences in the storage
- property offset: dict[str, int]
annotation offsets for each sequence
- property reversed_seqs: frozenset[str]
names of sequences that are reverse complemented