RelationsMatrix

Class to store and save ontology relations matrices.

Attributes:
  • matrix (NDArray) –

    A terms x terms matrix storing binary relationships between term pairs. For any (row, column) pair, a value of 1 means the term for that row is an ancestor of the term for that column. A value of 0 means the row term is not an ancestor of the column term.

  • terms (NDArray) –

    An array of the terms that label the rows and columns of the matrix.
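As a minimal sketch of how the matrix can be read, the example below uses a plain nested list in place of the NDArray attributes. The term names are hypothetical, and the 1s on the diagonal (each term counted as related to itself) are an assumption suggested by the rm_self options documented for RelationsLazyFrame, not something this class confirms.

```python
# Toy stand-ins for the `terms` and `matrix` attributes (hypothetical values).
terms = ["root", "organ", "liver"]
matrix = [
    [1, 1, 1],  # root is an ancestor of root, organ and liver
    [0, 1, 1],  # organ is an ancestor of organ and liver
    [0, 0, 1],  # liver is an ancestor of liver only
]


def is_ancestor(row_term, col_term):
    """Return True if row_term is an ancestor of col_term."""
    i = terms.index(row_term)
    j = terms.index(col_term)
    return matrix[i][j] == 1


def row_descendants(term):
    """Read along a row: every column marked 1 is a descendant of `term`."""
    i = terms.index(term)
    return [t for t, value in zip(terms, matrix[i]) if value == 1]


def column_ancestors(term):
    """Read down a column: every row marked 1 is an ancestor of `term`."""
    j = terms.index(term)
    return [t for t, row in zip(terms, matrix) if row[j] == 1]
```

Reading along a row gives descendants and reading down a column gives ancestors, which is the same orientation described for the relations frames.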

RelationsLazyFrame

Loader for the MetaHQ setup package ontology relations DataFrames.

Performs fast lookup of precomputed ancestor/descendant relationships of nodes within the ontology graph.

Relations DataFrame structure

The values of a collected frame answer the following questions, with 1 for 'yes' and 0 for 'no': Is a particular row an ancestor of a particular column? Is a particular column a descendant of a particular row?

get_ancestors(subset=None, rm_self=False)

Extract relationships of terms to their ancestors.

Note that terms queried for their ancestors are included in the output mapping.

Parameters:
  • subset (list[str] | None, default: None ) –

    A list of term IDs found in the columns of the relations .parquet file. If None (the default), all relationships are extracted.

  • rm_self (bool, default: False ) –

    If True, removes each key's own term ID from that key's list of values.

Returns:
  • dict

    A dictionary of term: [ancestors, ...] relationships.
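The shape of the returned mapping and the effect of subset and rm_self can be illustrated with a small pure-Python sketch. This is not the library's implementation: sketch_get_ancestors, the nested dict standing in for a collected frame, and the term names are all hypothetical, and it assumes each term is marked as related to itself.

```python
# relations[row][col] == 1 means "row is an ancestor of col" (toy data).
relations = {
    "root":  {"root": 1, "organ": 1, "liver": 1},
    "organ": {"root": 0, "organ": 1, "liver": 1},
    "liver": {"root": 0, "organ": 0, "liver": 1},
}


def sketch_get_ancestors(relations, subset=None, rm_self=False):
    """Hypothetical stand-in that mimics the documented output shape."""
    # With no subset, every column (term) is queried.
    columns = list(relations) if subset is None else subset
    result = {}
    for col in columns:
        # Ancestors of `col` are the rows holding a 1 in that column.
        ancestors = [row for row, cells in relations.items() if cells[col] == 1]
        if rm_self:
            # Drop the key's own term ID from its values.
            ancestors = [a for a in ancestors if a != col]
        result[col] = ancestors
    return result
```

Under these assumptions, querying "liver" with the default rm_self=False keeps "liver" in its own list, while rm_self=True leaves only "root" and "organ".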

get_descendants(subset=None, rm_self=True)

Extract relationships of terms to their descendants.

Note that terms queried for their descendants are included in the output mapping.

Parameters:
  • subset (list[str] | None, default: None ) –

    A list of term IDs found in the columns of the relations .parquet file. If None (the default), all relationships are extracted.

  • rm_self (bool, default: True ) –

    If True, removes each key's own term ID from that key's list of values.

Returns:
  • dict

    A dictionary of term: [descendants, ...] relationships.
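For the descendant direction, the lookup reads along a row rather than down a column, and rm_self defaults to True. The sketch below shows this with a toy matrix. The function name, term names, and data are hypothetical, and the self-relationship on the diagonal is an assumption.

```python
# Toy data: matrix[i][j] == 1 means terms[i] is an ancestor of terms[j].
terms = ["root", "organ", "liver"]
matrix = [
    [1, 1, 1],
    [0, 1, 1],
    [0, 0, 1],
]


def sketch_get_descendants(terms, matrix, subset=None, rm_self=True):
    """Hypothetical stand-in that mimics the documented output shape."""
    queried = terms if subset is None else subset
    result = {}
    for term in queried:
        i = terms.index(term)
        # Descendants of `term` are the columns holding a 1 in its row.
        found = [t for t, value in zip(terms, matrix[i]) if value == 1]
        if rm_self:
            # Default behaviour here: drop the key's own term ID.
            found = [t for t in found if t != term]
        result[term] = found
    return result
```

In this sketch a leaf term such as "liver" maps to an empty list by default, and to a list containing only itself when rm_self=False.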

from_parquet(file) classmethod

Load and format the relations dataframe.

This loads a base pl.LazyFrame from which ancestor relationships, descendant relationships, or both can be extracted using a single RelationLoader instance.

Returns:
  • DataFrame

    pl.DataFrame with an additional temporary 'index' column indicating the term ID for each row.

Raises:
  • PolarsError

    If file reading fails.