make_unaligned_seqs

make_unaligned_seqs(data: SeqsDataABC, *, moltype: Literal['dna', 'rna', 'protein', 'protein_with_stop', 'text', 'bytes'] | MolType, label_to_name: Callable[[str], str] | None = None, info: dict | None = None, source: str | Path | None = None, annotation_db: SupportsFeatures = None, offset: dict[str, int] | None = None, name_map: dict[str, str] | None = None, is_reversed: bool = False) → SequenceCollection

Initialise an unaligned collection of sequences.

Parameters:

data: sequence data, either a SeqsData instance, a dict {name: seq, …}, or an iterable of sequences
moltype: string representation of the moltype, e.g., ‘dna’, ‘protein’.
label_to_name: function for converting original names into other names.
info: a dict from which to make an info object
source: origins of this data, defaults to ‘unknown’. Converted to a string and added to info["source"].
annotation_db: annotation database to attach to the collection
offset: a dict mapping names to annotation offsets
name_map: a dict mapping sequence names to “parent” sequence names. The parent name will be used for querying an annotation_db.
is_reversed: entire collection has been reverse complemented
reversed_seqs: set of names that are on the reverse strand of the parent sequence
storage_backend: name of the storage backend to use for the SeqsData object, defaults to the cogent3 builtin backend.
kwargs: keyword arguments for the storage driver
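
As a rough illustration of the kind of callable label_to_name expects, the sketch below defines a function that maps a str to a str and applies it by hand to the keys of a plain dict. The helper name first_token and the sample labels are hypothetical, and the manual renaming only mimics the idea of converting original names. It is not how cogent3 applies the function internally.

```python
def first_token(label: str) -> str:
    """Return the first whitespace-delimited token of a label (hypothetical helper)."""
    return label.split()[0]


# Made-up input in the {name: seq} form described for the data parameter.
raw = {
    "seq1 Homo sapiens": "ACGGT",
    "seq2 Mus musculus": "ACGTT",
}

# Illustration only: convert each original label by hand to show the effect
# a label_to_name callable is meant to have on sequence names.
renamed = {first_token(label): seq for label, seq in raw.items()}

# With cogent3 installed, the callable would be passed through the documented
# keyword argument instead (not executed in this sketch):
#   from cogent3 import make_unaligned_seqs
#   seqs = make_unaligned_seqs(raw, moltype="dna", label_to_name=first_token)
```

Any function matching Callable[[str], str] should fit here. It is worth choosing one that cannot produce the same name for two different labels.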

Notes

If no annotation_db is provided but the sequences are annotated, an annotation_db is created by merging any annotation databases found on the sequences. If the sequences are annotated and an annotation_db is provided, only the provided annotation_db is used.
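
The precedence rule in this note can be sketched in plain Python. This is a minimal model under stated assumptions, not cogent3's implementation: each "database" is represented as a plain list of feature records, and resolve_annotation_db is a hypothetical helper name. Returning None when nothing is annotated is also an assumption, since the note does not cover that case.

```python
def resolve_annotation_db(provided, per_seq_dbs):
    """Pick the annotation source following the rule in the note (hypothetical helper).

    provided: an explicitly supplied database (here a list of records) or None
    per_seq_dbs: databases found on the individual sequences (lists or None)
    """
    # An explicitly provided database always wins; per-sequence ones are ignored.
    if provided is not None:
        return provided

    # Otherwise merge whatever the sequences carry.
    merged = []
    for db in per_seq_dbs:
        if db:
            merged.extend(db)

    # Assumption: nothing to merge means there is no database at all.
    return merged or None


# Made-up feature records standing in for real annotation databases.
db_a = [{"seqid": "s1", "biotype": "gene", "start": 0, "stop": 9}]
db_b = [{"seqid": "s2", "biotype": "exon", "start": 3, "stop": 6}]
explicit = [{"seqid": "s1", "biotype": "cds", "start": 1, "stop": 7}]

merged_only = resolve_annotation_db(None, [db_a, None, db_b])
explicit_wins = resolve_annotation_db(explicit, [db_a, db_b])
```

In the second call the per-sequence records are ignored entirely, which mirrors the statement that only the provided annotation_db is used.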