A class to store and operate on ID columns for tabular data.
Specifically made as an index for polars.DataFrame objects which
lack index anchoring and tracking.
Examples:
>>> import polars as pl
>>> from metahq_core.curations.index import Ids
>>> df = pl.DataFrame({
...     "sample": ["GSM1", "GSM2", "GSM3"],
...     "series": ["GSE1", "GSE1", "GSE2"],
...     "platform": ["GPL10", "GPL10", "GPL23"],
... })
>>> ids = Ids.from_dataframe(df, index_col="sample")
columns
property
Returns the column names of self.data.
Wrapper for polars.DataFrame.columns.
index
property
Get the index column as a Series.
Examples:
>>> import polars as pl
>>> from metahq_core.curations.index import Ids
>>> df = pl.DataFrame({
...     "sample": ["GSM1", "GSM2", "GSM3"],
...     "series": ["GSE1", "GSE1", "GSE2"],
...     "platform": ["GPL10", "GPL10", "GPL23"],
... })
>>> Ids.from_dataframe(df, index_col="sample").index
shape: (3,)
Series: 'sample' [str]
[
"GSM1"
"GSM2"
"GSM3"
]
filter_by_mask(mask)
Filter the ids DataFrame using a boolean mask.
Examples:
>>> import numpy as np
>>> import polars as pl
>>> from metahq_core.curations.index import Ids
>>> df = pl.DataFrame({
...     "sample": ["GSM1", "GSM2", "GSM3"],
...     "series": ["GSE1", "GSE1", "GSE2"],
...     "platform": ["GPL10", "GPL10", "GPL23"],
... })
>>> ids = Ids.from_dataframe(df, index_col="sample")
>>> ids.filter_by_mask(np.array([False, True, True])).data
┌────────┬────────┬──────────┐
│ sample ┆ series ┆ platform │
│ ---    ┆ ---    ┆ ---      │
│ str    ┆ str    ┆ str      │
╞════════╪════════╪══════════╡
│ GSM2   ┆ GSE1   ┆ GPL10    │
│ GSM3   ┆ GSE2   ┆ GPL23    │
└────────┴────────┴──────────┘
lazy()
Wrapper for polars.DataFrame.lazy().
to_numpy()
Wrapper for polars.DataFrame.to_numpy().
from_dataframe(df, index_col)
classmethod
Creates an Ids object from a polars DataFrame.
__getitem__(idx)
Slice the Ids frame with various indexing methods.
__len__()
Return the number of rows.