Sirota 2011

Bases: BaseProcessor

Processor for Sirota et al. 2011 annotations.

Explodes comma-separated GSM lists from each GDS row into individual sample records, joins with manually-curated UMLS → MONDO and UMLS → UBERON mapping files, and filters to system-level ontology descendants.
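The explode-and-join step described above can be sketched roughly as follows. This is a minimal illustration assuming pandas, not the processor's actual code: the column names (`gds_id`, `gsm_list`, `umls_id`), the helper `explode_samples`, and all identifiers in the toy data are hypothetical placeholders, and the ontology-descendant filter is omitted.

```python
import pandas as pd

# Hypothetical input: one row per GDS with a comma-separated GSM list.
# Column names and IDs are placeholders, not the real Sirota 2011 schema.
gds = pd.DataFrame(
    {
        "gds_id": ["GDS1", "GDS2"],
        "gsm_list": ["GSM10, GSM11", "GSM20"],
        "umls_id": ["C0000001", "C0000002"],
    }
)

# Hypothetical curated UMLS -> MONDO mapping (placeholder term ID and label).
umls_to_mondo = pd.DataFrame(
    {
        "umls_id": ["C0000001"],
        "term_id": ["MONDO:0000000"],
        "term_label": ["placeholder term"],
    }
)


def explode_samples(df: pd.DataFrame) -> pd.DataFrame:
    """Turn each comma-separated GSM list into one row per sample."""
    out = df.assign(sample_id=df["gsm_list"].str.split(","))
    out = out.explode("sample_id")
    # Remove whitespace left over from entries such as "GSM10, GSM11".
    out["sample_id"] = out["sample_id"].str.strip()
    return out.drop(columns="gsm_list").reset_index(drop=True)


samples = explode_samples(gds)

# An inner join keeps only samples whose UMLS concept has a curated mapping.
annotated = samples.merge(umls_to_mondo, on="umls_id", how="inner")
```

The inner join is one possible design choice: rows without a curated mapping are dropped silently. Whether the real processor drops or retains unmapped rows is not stated here.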

`process(output_dir, **kwargs)`

Process Sirota 2011 CSV into standardized sample annotations.

Parameters:	`output_dir` (`Path`) – Directory where the processed parquet file will be written.
	`**kwargs` (`Any`, default: `{}`) – Optional overrides. `input_path` (`Path | str`): overrides the default input path, which is `SIROTA_2011_CSV` from config.

Returns:	`DataFrame` – Standardized annotations with columns `sample_id`, `annotation_type`, `term_id`, `term_label`, and `ecode`.

`validate(data)`

Validate processed Sirota 2011 data.

Parameters:	`data` (`DataFrame`) – Processed annotations DataFrame to validate.

Returns:	`bool` – True if validation passes.

Raises:	`ValidationError` – If required columns are missing.
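A column check of the kind described above might look like the sketch below. It assumes pandas and assumes that the required columns are the five listed under `process`; the `ValidationError` class and the `validate_columns` function are stand-ins defined here for illustration, not the project's own.

```python
import pandas as pd

# Assumption: the required columns match those returned by process().
REQUIRED_COLUMNS = ["sample_id", "annotation_type", "term_id", "term_label", "ecode"]


class ValidationError(Exception):
    """Stand-in for the project's ValidationError (hypothetical definition)."""


def validate_columns(data: pd.DataFrame) -> bool:
    """Return True if all required columns are present, otherwise raise."""
    missing = [col for col in REQUIRED_COLUMNS if col not in data.columns]
    if missing:
        raise ValidationError(f"Missing required columns: {missing}")
    return True
```

Called on a frame that lacks, say, `ecode`, this sketch raises `ValidationError` naming the missing column rather than returning `False`, which matches the documented return and raise behaviour.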
