make_unaligned_seqs

make_unaligned_seqs(data: dict[str, str | bytes | ndarray[int]] | list | SeqsDataABC, *, moltype: Literal['dna', 'rna', 'protein', 'protein_with_stop', 'text', 'bytes'] | MolType, label_to_name: Callable[[str], str] | None = None, info: dict | None = None, source: str | Path | None = None, annotation_db: SupportsFeatures | None = None, offset: dict[str, int] | None = None, name_map: dict[str, str] | None = None, is_reversed: bool = False, reversed_seqs: set[str] | None = None, storage_backend: str | None = None, **kwargs) → SequenceCollection
make_unaligned_seqs(data: SeqsDataABC, *, moltype: Literal['dna', 'rna', 'protein', 'protein_with_stop', 'text', 'bytes'] | MolType, label_to_name: Callable[[str], str] | None = None, info: dict | None = None, source: str | Path | None = None, annotation_db: SupportsFeatures = None, offset: dict[str, int] | None = None, name_map: dict[str, str] | None = None, is_reversed: bool = False) → SequenceCollection

Initialise an unaligned collection of sequences.

Parameters:
data

sequence data: a SeqsData instance, a dict {name: seq, …}, or an iterable of sequences

moltype

the molecular type, given as a string (e.g., ‘dna’, ‘protein’) or as a MolType instance.

label_to_name

function for converting original names into other names.

info

a dict from which to make an info object

source

origins of this data, defaults to ‘unknown’. Converted to a string and added to info[“source”].

annotation_db

annotation database to attach to the collection

offset

a dict mapping names to annotation offsets

name_map

a dict mapping sequence names to “parent” sequence names. The parent name will be used for querying an annotation_db.

is_reversed

whether the entire collection has been reverse complemented

reversed_seqs

set of names that are on the reverse strand of the parent sequence

storage_backend

name of the storage backend to use for the SeqsData object, defaults to the cogent3 builtin backend.

kwargs

keyword arguments passed to the storage backend
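
A minimal usage sketch for the data, moltype and label_to_name parameters described above. The helper first_field and the sequence labels are hypothetical, chosen only for illustration. The import is guarded so the helper can still be tried when cogent3 is not installed.

```python
def first_field(label: str) -> str:
    """Hypothetical label_to_name helper: keep the text before the first space."""
    return label.split()[0]


# Made-up example data in the {name: seq} dict form accepted by `data`.
raw = {
    "seq1 Homo sapiens": "ACGGTA",
    "seq2 Mus musculus": "ACGTTA",
}

try:
    from cogent3 import make_unaligned_seqs
except ImportError:  # cogent3 is not available, skip the library call
    make_unaligned_seqs = None

if make_unaligned_seqs is not None:
    # moltype is keyword-only; label_to_name renames each sequence on load.
    seqs = make_unaligned_seqs(raw, moltype="dna", label_to_name=first_field)
    print(seqs.names)
```

With label_to_name applied, the collection should be keyed by the shortened names rather than the full original labels.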

Notes

If no annotation_db is provided, but the sequences are annotated, an annotation_db is created by merging any annotation dbs found in the sequences. If the sequences are annotated AND an annotation_db is provided, only the annotation_db is used.
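
The precedence rule in this note can be sketched in plain Python. This is an illustration only, not cogent3's implementation: resolve_annotation_db is a hypothetical function, and plain lists of dicts stand in for annotation databases.

```python
def resolve_annotation_db(provided, per_seq_dbs):
    """Pick the annotation records a collection would end up with.

    provided: list of feature records, or None if no annotation_db was given.
    per_seq_dbs: one entry per sequence, each a list of records or None.
    """
    # An explicitly provided db wins; dbs on the sequences are ignored.
    if provided is not None:
        return list(provided)

    # Otherwise merge whatever is found on the sequences, skipping duplicates.
    merged = []
    for db in per_seq_dbs:
        if db is None:
            continue
        for record in db:
            if record not in merged:
                merged.append(record)
    return merged


# Made-up feature records for illustration.
gene = {"seqid": "s1", "biotype": "gene", "start": 0, "stop": 6}
exon = {"seqid": "s2", "biotype": "exon", "start": 2, "stop": 4}

merged_case = resolve_annotation_db(None, [[gene], [exon], None])
provided_case = resolve_annotation_db([exon], [[gene]])
```

In the first call nothing is provided, so the per-sequence records are merged. In the second call the provided records are used alone.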