make_unaligned_seqs
- make_unaligned_seqs(data: dict[str, str | bytes | ndarray[int]] | list | SeqsDataABC, *, moltype: Literal['dna', 'rna', 'protein', 'protein_with_stop', 'text', 'bytes'] | MolType, label_to_name: Callable[[str], str] | None = None, info: dict | None = None, source: str | Path | None = None, annotation_db: SupportsFeatures | None = None, offset: dict[str, int] | None = None, name_map: dict[str, str] | None = None, is_reversed: bool = False, reversed_seqs: set[str] | None = None, storage_backend: str | None = None, **kwargs) -> SequenceCollection
- make_unaligned_seqs(data: SeqsDataABC, *, moltype: Literal['dna', 'rna', 'protein', 'protein_with_stop', 'text', 'bytes'] | MolType, label_to_name: Callable[[str], str] | None = None, info: dict | None = None, source: str | Path | None = None, annotation_db: SupportsFeatures = None, offset: dict[str, int] | None = None, name_map: dict[str, str] | None = None, is_reversed: bool = False) -> SequenceCollection
Initialise an unaligned collection of sequences.
- Parameters:
- data
sequence data: a SeqsData instance, a dict {name: seq, …}, or an iterable of sequences
- moltype
name of the moltype as a string, e.g., ‘dna’, ‘protein’, or a MolType instance.
- label_to_name
function for converting original names into other names.
- info
a dict from which to make an info object
- source
origins of this data, defaults to ‘unknown’. Converted to a string and added to info[“source”].
- annotation_db
annotation database to attach to the collection
- offset
a dict mapping names to annotation offsets
- name_map
a dict mapping sequence names to “parent” sequence names. The parent name is used when querying an annotation_db.
- is_reversed
entire collection has been reverse complemented
- reversed_seqs
set of names that are on the reverse strand of the parent sequence
- storage_backend
name of the storage backend to use for the SeqsData object, defaults to cogent3 builtin.
- kwargs
keyword arguments for the storage driver
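The parameters above can be combined as in the following minimal sketch. The input dict, the `first_word` helper, and the `info` and `source` values are hypothetical, chosen only for illustration. The example assumes cogent3 is installed and skips the call otherwise.

```python
def first_word(label: str) -> str:
    """Hypothetical label_to_name helper: keep the text before the first space."""
    return label.split()[0]


# Hypothetical input: each label carries a description after the identifier.
raw = {
    "seq1 Homo sapiens": "ACGGTTAC",
    "seq2 Pan troglodytes": "ACGTA",
}


def build_collection():
    # Imported here so the helper above stays usable without cogent3.
    from cogent3 import make_unaligned_seqs

    return make_unaligned_seqs(
        raw,
        moltype="dna",              # moltype is a required keyword argument
        label_to_name=first_word,   # applied to each original label
        info={"project": "demo"},   # hypothetical metadata
        source="in-memory example",
    )


try:
    seqs = build_collection()
    print(seqs.names)
except ImportError:
    seqs = None  # cogent3 is not installed; nothing to show
```

Because the sequences are unaligned, they may differ in length, as the two hypothetical sequences here do.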
Notes
If no annotation_db is provided but the sequences are annotated, an annotation_db is created by merging any annotation dbs found in the sequences. If the sequences are annotated and an annotation_db is also provided, only the provided annotation_db is used.
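The precedence rule in this note can be illustrated with plain Python. This is a sketch of the rule only, not cogent3's implementation: the function `choose_annotations` is hypothetical, and lists of feature records stand in for annotation databases.

```python
def choose_annotations(provided, per_sequence):
    """Illustrate the precedence rule with plain lists of feature records.

    provided: list of records or None (stand-in for the annotation_db argument)
    per_sequence: dict mapping sequence name to a list of records or None
        (stand-in for annotation dbs already attached to the sequences)
    """
    if provided is not None:
        # An explicit annotation_db wins; per-sequence annotations are ignored.
        return list(provided)
    merged = []
    for records in per_sequence.values():
        if records:
            merged.extend(records)
    # With nothing provided and nothing attached, there is no annotation db.
    return merged or None


explicit = [{"name": "gene-x"}]
attached = {"s1": [{"name": "exon-a"}], "s2": [{"name": "exon-b"}]}

used_explicit = choose_annotations(explicit, attached)
used_merged = choose_annotations(None, attached)
```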