BasicAnnotationDb
- class BasicAnnotationDb(*args: Any, **kwargs: Any)
Provides a user table for annotations. This can be merged with either the Gff or Genbank versions.
- Attributes:
- db
- describe
top level description of the annotation db
- table_names
Methods
add_feature(*, seqid, biotype, name, spans): adds a record to user table
biotype_counts(): return counts of biological types across all tables and seqids
close(): closes the db
compatible(other_db[, symmetric]): checks whether table_names are compatible
count_distinct(*[, seqid, biotype, name]): return table of counts of distinct values
get_feature_children(name[, biotype]): yields children of name
get_feature_parent(name, **kwargs): yields parents of name
get_records_matching(*[, biotype, seqid, ...]): return all fields for matching records
make_indexes(): adds db indexes for core attributes
num_matches(*[, seqid, biotype, name, ...]): return the number of records matching condition
subset(*[, source, biotype, seqid, name, ...]): returns a new db instance with records matching the provided conditions
to_rich_dict(): returns a dict suitable for json serialisation
union(annot_db): returns a new instance with records merged with those of annot_db
update(annot_db[, seqids]): update records with those from an instance of the same type
write(path): writes db as bytes to path
add_records
from_dict
get_features_matching
to_json
Notes
This is the default db on Sequence, SequenceCollection and Alignment
- StrOrBool = str | bool
- add_feature(*, seqid: str, biotype: str, name: str, spans: list[tuple[int, int]] | ndarray[tuple[Any, ...], dtype[integer]], parent_id: str | None = None, strand: str | int | Strand | None = None, attributes: str | None = None, on_alignment: bool | None = False) -> None
adds a record to user table
- Parameters:
- seqid
name of the sequence the feature resides on
- biotype
biological type of the record
- name
the name of a record, an identifier
- spans
list of (start, stop) coordinate pairs; this will be sorted
- strand
either ‘+’ or ‘-’. Defaults to ‘+’
- attributes
additional attributes as a string
- on_alignment
whether the annotation is an alignment annotation
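The note that spans are sorted can be illustrated with a small stand-alone sketch. It does not call the library: `normalise_spans` and the `record` dict are hypothetical, and deriving an overall start and stop from the sorted spans is an assumption made for illustration only.

```python
def normalise_spans(spans):
    """Hypothetical helper: sort (start, stop) pairs and report their extent."""
    ordered = sorted((int(start), int(stop)) for start, stop in spans)
    start = ordered[0][0]
    stop = max(stop for _, stop in ordered)
    return ordered, start, stop


# Segments supplied out of order, as might happen when parsing a file.
spans = [(300, 450), (10, 120), (200, 260)]
ordered, start, stop = normalise_spans(spans)

# A plain dict standing in for one record. The keys mirror the keyword
# arguments of add_feature, but the layout itself is an assumption.
record = {
    "seqid": "chr1",
    "biotype": "CDS",
    "name": "cds-1",
    "spans": ordered,
    "strand": "+",
}
print(record["spans"], start, stop)
```

Passing the spans in any order would therefore give the same stored coordinates under this reading.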
- add_records(records: Iterable[dict[str, Any]], **kwargs: Any) -> None
- biotype_counts() -> dict[str, int]
return counts of biological types across all tables and seqids
- close() -> None
closes the db
- compatible(other_db: SqliteAnnotationDbMixin, symmetric: bool = True) -> bool
checks whether table_names are compatible
- Parameters:
- other_db
the other annotation db instance
- symmetric
checks only that the tables of other_db are equal to, or a subset of, mine
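A table-name comparison of this kind can be sketched with plain sets. This is one possible reading rather than the library's code: it assumes that `symmetric=True` accepts containment in either direction, while `symmetric=False` requires the tables of `other_db` to be equal to, or a subset of, those of this db. The helper `tables_compatible` and the table names are hypothetical.

```python
def tables_compatible(mine, theirs, symmetric=True):
    """Hypothetical helper: compare two collections of table names."""
    mine, theirs = set(mine), set(theirs)
    if symmetric:
        # either set of tables may contain the other
        return mine <= theirs or theirs <= mine
    # asymmetric: the other db's tables must be equal to, or a subset of, mine
    return theirs <= mine


# Illustrative table names only; the real names may differ.
basic = ("user",)
gff = ("gff", "user")
genbank = ("gb", "user")

results = {
    "basic vs gff": tables_compatible(basic, gff),
    "basic vs gff, asymmetric": tables_compatible(basic, gff, symmetric=False),
    "gff vs basic, asymmetric": tables_compatible(gff, basic, symmetric=False),
    "gff vs genbank": tables_compatible(gff, genbank),
}
print(results)
```

Under these assumptions a db holding only a user table is compatible with either specialised db, whereas two different specialised dbs are not compatible with each other.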
- count_distinct(*, seqid: StrOrBool = False, biotype: StrOrBool = False, name: StrOrBool = False) -> Table | None
return table of counts of distinct values
- Parameters:
- seqid, biotype, name
if a string, selects the subset of rows matching the provided values and counts distinct values for the other fields whose value is True.
- Returns:
- Table with columns corresponding to the arguments whose value was True
Examples
To compute copy number by gene name within each genome
>>> counts_table = db.count_distinct(seqid=True, biotype="gene", name=True)
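The filter-then-count behaviour described for this method resembles a grouped SQL query. The sketch below uses Python's `sqlite3` module directly and is not the library's implementation; the `feature` table, its three columns and the sample rows are hypothetical and only mirror the `seqid`, `biotype` and `name` arguments.

```python
import sqlite3

# Hypothetical stand-in for an annotation table; the real schema may differ.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE feature (seqid TEXT, biotype TEXT, name TEXT)")
rows = [
    ("genome1", "gene", "abcA"),
    ("genome1", "gene", "abcA"),  # second copy of abcA in genome1
    ("genome1", "CDS", "abcA"),  # excluded by the biotype filter
    ("genome2", "gene", "abcA"),
    ("genome2", "gene", "xyzB"),
]
conn.executemany("INSERT INTO feature VALUES (?, ?, ?)", rows)

# Analogue of count_distinct(seqid=True, biotype="gene", name=True):
# filter on the string-valued field, group by the fields set to True.
sql = (
    "SELECT seqid, name, COUNT(*) AS count FROM feature "
    "WHERE biotype = ? GROUP BY seqid, name ORDER BY seqid, name"
)
counts = conn.execute(sql, ("gene",)).fetchall()
conn.close()
print(counts)
```

Each resulting row pairs a (seqid, name) combination with the number of matching gene records, which is the copy-number reading suggested by the example above.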
- property db: Connection
- classmethod from_dict(data: dict[str, Any]) -> Self
- get_feature_children(name: str, biotype: str | None = None, **kwargs: Any) -> Iterator[FeatureDataType]
yields children of name
- get_feature_parent(name: str, **kwargs: Any) -> Iterator[FeatureDataType]
yields parents of name
- get_features_matching(*, biotype: str | None = None, seqid: str | None = None, name: str | None = None, start: int | None = None, stop: int | None = None, strand: str | None = None, attributes: str | None = None, on_alignment: bool | None = None, allow_partial: bool = False) -> Iterator[FeatureDataType]
- get_records_matching(*, biotype: str | None = None, seqid: str | None = None, name: str | None = None, start: int | None = None, stop: int | None = None, strand: str | None = None, attributes: str | None = None, on_alignment: bool | None = None, allow_partial: bool = False) -> Iterator[dict[str, Any]]
return all fields for matching records
- make_indexes() -> None
adds db indexes for core attributes
- num_matches(*, seqid: str | None = None, biotype: str | None = None, name: str | None = None, strand: str | None = None, attributes: str | None = None, on_alignment: bool | None = None) -> int
return the number of records matching condition
- subset(*, source: str | PathLike[Any] | PurePath | Path = ':memory:', biotype: str | None = None, seqid: str | None = None, name: str | None = None, start: int | None = None, stop: int | None = None, strand: str | None = None, attributes: str | None = None, allow_partial: bool = False) -> Self
returns a new db instance with records matching the provided conditions
- property table_names: tuple[str, ...]
- to_json() -> str
- to_rich_dict() -> dict[str, Any]
returns a dict suitable for json serialisation
- union(annot_db: SqliteAnnotationDbMixin) -> SqliteAnnotationDbMixin
returns a new instance with records merged with those of annot_db
- Parameters:
- annot_db
an annotation db whose schema is either a subset or a superset of that of self
- Returns:
- The class whose schema contains the other
- update(annot_db: SqliteAnnotationDbMixin, seqids: str | list[str] | None = None, **kwargs: Any) -> None
update records with those from an instance of the same type
- write(path: str | PathLike[Any] | PurePath | Path) -> None
writes db as bytes to path