Annotations in long format.
Exists to support modularity and readability within the Query class.
| Attributes: |
|
|---|
column_intersection_with(columns)
Find intersection between columns and the columns in the annotations attribute.
| Parameters: |
|
|---|
| Returns: |
|
|---|
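As a rough illustration of the intersection described above, the following plain-Python sketch keeps the requested column names that also appear in a table's columns. The helper name `intersect_columns`, the list return type, and the preserved ordering are assumptions made for illustration; the method's actual return type is not documented here.

```python
def intersect_columns(requested, available):
    """Return the requested names that also appear in `available`.

    Hypothetical helper, not part of metahq_core.
    """
    available_set = set(available)
    seen = set()
    result = []
    for name in requested:
        # Keep the order of `requested` and skip duplicates.
        if name in available_set and name not in seen:
            seen.add(name)
            result.append(name)
    return result


annotation_columns = ["sample", "series", "platform", "id", "value"]
shared = intersect_columns(["sample", "group", "id"], annotation_columns)
print(shared)
```

Here `group` is dropped because it is not among the annotation columns.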
filter_na(column)
Removes entries in a column that are NA-like values (e.g., 'NA' or 'none'). Updates the annotations attribute in place.
| Parameters: |
|
|---|
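The filtering idea can be sketched in plain Python, with rows represented as dictionaries instead of a DataFrame. The `NA_LIKE` set and the `drop_na_like` helper are assumptions for illustration; the library's actual list of NA-like markers may differ, and unlike `filter_na` this sketch returns a new list rather than updating anything in place.

```python
# Assumed set of NA-like markers (compared case-insensitively).
NA_LIKE = {"na", "none", "nan", ""}


def drop_na_like(rows, column):
    """Return the rows whose `column` entry is not NA-like.

    Hypothetical helper, not part of metahq_core.
    """
    kept = []
    for row in rows:
        value = row.get(column)
        # Treat a missing value and any NA-like string the same way.
        if value is None or str(value).strip().lower() in NA_LIKE:
            continue
        kept.append(row)
    return kept


rows = [
    {"sample": "GSM1", "id": "UBERON:0000948"},
    {"sample": "GSM2", "id": "NA"},
    {"sample": "GSM3", "id": "none"},
]
filtered = drop_na_like(rows, "id")
print(filtered)
```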
stage_anchor(anchor)
Filters NA values from the anchor annotations column.
| Parameters: |
|
|---|
stage_level(level)
Filters NA values from the specified ID level column. If level is 'group', then it will also remove annotations with index IDs.
| Parameters: |
|
|---|
stage(level, anchor)
Stages the annotations DataFrame to be converted to wide format. Mutates the annotations attribute in place.
| Parameters: |
|
|---|
pivot_wide(level, anchor, id_cols)
Pivots the annotations to wide format, with one-hot-encoded binary entries for each annotation.
| Parameters: |
|
|---|
| Returns: |
|
|---|
Examples:
>>> import polars as pl
>>> from metahq_core.query import LongAnnotations
>>> anno = pl.DataFrame({
...     'sample': ['GSM1', 'GSM2', 'GSM3'],
...     'series': ['GSE1', 'GSE1', 'GSE2'],
...     'platform': ['GPL1', 'GPL2', 'GPL2'],
...     'id': ['UBERON:0000948|UBERON:0002349', 'UBERON:0002113', 'UBERON:0000955'],
...     'value': ['heart|myocardium', 'kidney', 'brain'],
... })
>>> anno = LongAnnotations(anno)
>>> anno.pivot_wide(
...     level='sample', anchor='id', id_cols=['sample', 'series']
... )
┌────────┬────────┬────────────────┬────────────────┬────────────────┬────────────────┐
│ series ┆ sample ┆ UBERON:0000948 ┆ UBERON:0002349 ┆ UBERON:0002113 ┆ UBERON:0000955 │
│ ---    ┆ ---    ┆ ---            ┆ ---            ┆ ---            ┆ ---            │
│ str    ┆ str    ┆ i32            ┆ i32            ┆ i32            ┆ i32            │
╞════════╪════════╪════════════════╪════════════════╪════════════════╪════════════════╡
│ GSE1   ┆ GSM1   ┆ 1              ┆ 1              ┆ 0              ┆ 0              │
│ GSE1   ┆ GSM2   ┆ 0              ┆ 0              ┆ 1              ┆ 0              │
│ GSE2   ┆ GSM3   ┆ 0              ┆ 0              ┆ 0              ┆ 1              │
└────────┴────────┴────────────────┴────────────────┴────────────────┴────────────────┘
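To make the one-hot encoding in the example above concrete, here is a minimal plain-Python sketch of the same transformation. It is not the library's implementation: `pivot_wide_sketch` is a hypothetical helper, rows are dictionaries rather than a polars DataFrame, the `|` separator is inferred from the example data, and the sketch assumes one input row per sample.

```python
def pivot_wide_sketch(rows, anchor, id_cols, sep="|"):
    """One-hot encode the `sep`-delimited terms in `anchor` for each row.

    Hypothetical helper, not part of metahq_core.
    """
    # Collect every distinct term in first-seen order; each becomes a column.
    terms = []
    for row in rows:
        for term in row[anchor].split(sep):
            if term not in terms:
                terms.append(term)

    wide = []
    for row in rows:
        present = set(row[anchor].split(sep))
        # Carry over the identifier columns, then add one 0/1 entry per term.
        record = {col: row[col] for col in id_cols}
        for term in terms:
            record[term] = 1 if term in present else 0
        wide.append(record)
    return wide


rows = [
    {"sample": "GSM1", "series": "GSE1", "id": "UBERON:0000948|UBERON:0002349"},
    {"sample": "GSM2", "series": "GSE1", "id": "UBERON:0002113"},
    {"sample": "GSM3", "series": "GSE2", "id": "UBERON:0000955"},
]
wide = pivot_wide_sketch(rows, anchor="id", id_cols=["sample", "series"])
for record in wide:
    print(record)
```

Each output record carries the identifier columns plus one binary entry per term, mirroring the rows of the table above.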