CharAlphabet
- class CharAlphabet(chars: Sequence[str | bytes], gap: str | None = None, missing: str | None = None)
Represents a fundamental monomer character set.
- Attributes:
- gap_char
- gap_index
- missing_char
- missing_index
- moltype
- motif_len
- num_canonical
Methods
- array_to_bytes(seq): returns seq as a byte string
- as_bytes(): returns self as a byte string
- count(value, /): Return number of occurrences of value.
- from_rich_dict(data): returns an instance from a serialised dictionary
- get_kmer_alphabet(k[, include_gap]): returns kmer alphabet with words of size k
- index(value[, start, stop]): Return first index of value.
- to_json(): returns a serialisable string
- to_rich_dict(): returns a serialisable dictionary
- with_gap_motif([gap_char, missing_char]): returns new monomer alphabet with gap and missing characters added
- from_indices
- get_motif_len
- get_word_alphabet
- is_valid
- to_indices
Notes
Provides methods for efficient conversion between characters and integer indices, accepting the fundamental types str, bytes and numpy arrays.
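The character-to-integer conversion described in the note can be illustrated with a minimal standalone sketch. It does not import cogent3 and is not the library's implementation: the alphabet ordering `TCAG`, the `INVALID` marker and the helper functions are all assumptions made for illustration, and plain lists are used where the real methods return numpy arrays.

```python
# Standalone illustration only; this is not cogent3's implementation.
chars = "TCAG"  # hypothetical alphabet ordering

# Build a 256-entry translation table that maps each character's byte
# value to its position in the alphabet; 255 marks "not in alphabet".
INVALID = 255
table = bytearray([INVALID] * 256)
for i, c in enumerate(chars.encode("utf8")):
    table[c] = i
to_index_table = bytes(table)

# Reverse table: index -> byte value of the character (padded to 256).
from_index_table = chars.encode("utf8") + bytes(256 - len(chars))


def to_indices(seq: str) -> list[int]:
    """Convert a string into a list of alphabet indices."""
    return list(seq.encode("utf8").translate(to_index_table))


def from_indices(indices: list[int]) -> str:
    """Convert alphabet indices back into a string."""
    return bytes(indices).translate(from_index_table).decode("utf8")


indices = to_indices("GATTACA")
print(indices)                # [3, 2, 0, 0, 2, 1, 2]
print(from_indices(indices))  # GATTACA
```

A lookup table of this kind converts a whole sequence in one call, which is one common way to make such conversions fast.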
- array_to_bytes(seq: ndarray) -> bytes
returns seq as a byte string
- as_bytes() -> bytes
returns self as a byte string
- count(value, /)
Return number of occurrences of value.
- from_indices(seq: str | bytes | ndarray) -> str
- from_indices(seq: str) -> str
- from_indices(seq: bytes) -> str
- from_indices(seq: ndarray) -> str
- classmethod from_rich_dict(data: dict) -> CharAlphabet
returns an instance from a serialised dictionary
- property gap_char: str | None
- property gap_index: int | None
- get_kmer_alphabet(k: int, include_gap: bool = True) -> KmerAlphabet
returns kmer alphabet with words of size k
- Parameters:
- k
word size
- include_gap
if True and self.gap_char is defined, the gap character of the returned KmerAlphabet is set to self.gap_char * k
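As a rough illustration of what "words of size k" and the `include_gap` rule mean, here is a standalone sketch that enumerates k-mers with `itertools.product`. The function `kmer_words` is hypothetical and the ordering of the words is an assumption; the sketch does not reproduce how `KmerAlphabet` is actually built.

```python
from itertools import product


def kmer_words(chars, k, gap_char=None, include_gap=True):
    """Return every word of length k over chars, as a tuple of strings.

    If include_gap is True and a gap character is given, the k-mer gap
    state (gap_char repeated k times) is appended as the final word.
    """
    words = ["".join(p) for p in product(chars, repeat=k)]
    if include_gap and gap_char:
        words.append(gap_char * k)
    return tuple(words)


dinucs = kmer_words("TCAG", 2, gap_char="-")
print(len(dinucs))            # 17: 4**2 words plus the gap state
print(dinucs[0], dinucs[-1])  # TT --
```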
- get_motif_len() -> int
- get_word_alphabet(k: int, include_gap: bool = True) -> KmerAlphabet
- index(value, start=0, stop=sys.maxsize, /)
Return first index of value.
Raises ValueError if the value is not present.
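The docstrings for count and index match those of Python's built-in tuple. The sketch below therefore uses a plain tuple as a stand-in for the alphabet (an assumption; it does not import cogent3) to show the documented behaviour, including the ValueError raised for an absent value.

```python
# A plain tuple stands in for the alphabet; the characters are hypothetical.
alphabet = ("T", "C", "A", "G")

position = alphabet.index("A")     # first index of "A"
occurrences = alphabet.count("A")  # number of times "A" appears

try:
    alphabet.index("N")  # "N" is not a member, so this raises
    found = True
except ValueError:
    found = False

print(position, occurrences, found)  # 2 1 False
```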
- is_valid(seq: str | bytes | ndarray) -> bool
- is_valid(seq: str) -> bool
- is_valid(seq: bytes) -> bool
- is_valid(seq: ndarray) -> bool
- property missing_char: str | None
- property missing_index: int | None
- property motif_len: int
- property num_canonical: int
- to_indices(seq: str | bytes | ndarray) -> ndarray[int]
- to_indices(seq: bytes) -> ndarray[int]
- to_indices(seq: str) -> ndarray[int]
- to_indices(seq: ndarray) -> ndarray[int]
- to_json() -> str
returns a serialisable string
- to_rich_dict() -> dict
returns a serialisable dictionary
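The relationship between to_rich_dict, to_json and from_rich_dict can be sketched as a round trip through plain Python data. Everything below is hypothetical: the dictionary keys (`chars`, `gap`, `missing`) and the free functions are assumptions chosen to mirror the constructor arguments, and the keys cogent3 actually writes may differ.

```python
import json


# Hypothetical dictionary layout; the real serialised form may differ.
def to_rich_dict(chars, gap=None, missing=None):
    """Return a plain dict holding what is needed to rebuild the alphabet."""
    return {"chars": list(chars), "gap": gap, "missing": missing}


def to_json(chars, gap=None, missing=None):
    """Return the rich dict encoded as a JSON string."""
    return json.dumps(to_rich_dict(chars, gap, missing))


def from_rich_dict(data):
    """Rebuild the (chars, gap, missing) triple from a rich dict."""
    return tuple(data["chars"]), data["gap"], data["missing"]


original = (("T", "C", "A", "G", "-"), "-", None)
text = to_json(*original)
restored = from_rich_dict(json.loads(text))
print(restored == original)  # True
```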
- with_gap_motif(gap_char='-', missing_char='?')
returns new monomer alphabet with gap and missing characters added
- Parameters:
- gap_char
the IUPAC gap character “-”
- missing_char
the IUPAC missing character “?”
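A minimal sketch of the idea behind with_gap_motif, using a plain tuple in place of the alphabet. The standalone function is hypothetical, and appending the two characters at the end, as well as skipping characters already present, are assumptions rather than documented behaviour.

```python
def with_gap_motif(chars, gap_char="-", missing_char="?"):
    """Return a new tuple of characters with gap and missing states added.

    Assumption: the extra characters go after the existing ones, and a
    character already present is not added a second time.
    """
    extra = tuple(c for c in (gap_char, missing_char) if c not in chars)
    return tuple(chars) + extra


canonical = ("T", "C", "A", "G")
extended = with_gap_motif(canonical)
print(extended)   # ('T', 'C', 'A', 'G', '-', '?')
print(canonical)  # ('T', 'C', 'A', 'G'), the input is left unchanged
```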