Create a file containing citations for each source in a MetaHQ query result. A new Reference must be created for every new annotation set added to MetaHQ.

Author: Parker Hicks
Date: 2026-03-31

Last updated: 2026-04-03 by Parker Hicks

CitationConfig dataclass

Storage for attributes required to properly format a reference.

Attributes:
  • version (str) –

    Version of the MetaHQ database.

  • terms (str) –

    Query terms.

  • attribute (str) –

    Attribute of the queried database entries.

  • level (str) –

    Curation level.

  • species (str) –

    Species of the queried database entries.

  • ecode (str) –

    Evidence code of the queried database entries.

  • tech (str) –

    Technology of the queried database entries.

  • mode (str) –

    Query mode (e.g., annotate, label).

  • license (str) –

    License filter applied to the query.

  • date (str) –

    Date formatted as 'YYYY-MM-DD HR:MIN:SEC'.

  • outfile (str | Path) –

    Path of the output file to save the reference to.
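
The attribute list above can be pictured as a plain dataclass. The following is a minimal sketch, not the MetaHQ implementation: `CitationConfigSketch` is a hypothetical stand-in that only mirrors the documented attribute names and types, and every value passed to it is a made-up placeholder.

```python
from dataclasses import dataclass
from datetime import datetime
from pathlib import Path
from typing import Union


@dataclass
class CitationConfigSketch:
    """Hypothetical stand-in that mirrors the documented attributes."""

    version: str
    terms: str
    attribute: str
    level: str
    species: str
    ecode: str
    tech: str
    mode: str
    license: str
    date: str
    outfile: Union[str, Path]


# All values below are illustrative placeholders, not real MetaHQ settings.
config = CitationConfigSketch(
    version="0.0.0",
    terms="term-a,term-b",
    attribute="example-attribute",
    level="example-level",
    species="example-species",
    ecode="example-ecode",
    tech="example-tech",
    mode="annotate",
    license="example-license",
    # Matches the documented 'YYYY-MM-DD HR:MIN:SEC' layout.
    date=datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
    outfile=Path("CITATION.txt"),
)
```

Storing the date as a pre-formatted string keeps the config a simple bag of text values that can be substituted into a template without further conversion.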

build_citation_file(references, config, indent='')

Build the final citation file substituting placeholder variables in the citation template.

Parameters:
  • references (str) –

    A string of formatted references from the format_references function.

  • config (CitationConfig) –

    A populated CitationConfig.

  • indent (str, default: '' ) –

    The subsequent indentation for the MetaHQ reference.

Returns:
  • str

    The populated string to save to a CITATION.txt file.
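
Placeholder substitution of this kind can be sketched with the standard library's `string.Template`. This is an illustration under stated assumptions: the template text, the placeholder names, and the `fill_template` helper are all hypothetical, since the actual MetaHQ citation template is not shown in this reference.

```python
from string import Template

# Hypothetical template; the real MetaHQ citation template may look different.
CITATION_TEMPLATE = Template(
    "MetaHQ version: $version\n"
    "Query mode: $mode\n"
    "Date: $date\n"
    "\n"
    "References:\n"
    "$references\n"
)


def fill_template(references: str, version: str, mode: str, date: str) -> str:
    """Replace every placeholder in the template with a concrete value."""
    # substitute() raises KeyError if any placeholder is left unfilled,
    # which guards against writing a half-populated citation file.
    return CITATION_TEMPLATE.substitute(
        references=references, version=version, mode=mode, date=date
    )


filled = fill_template(
    "[1] ExampleSource", "0.0.0", "annotate", "2026-04-03 12:00:00"
)
```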

build_reference_list(value_counts)

Build a list of initialized references from a polars.DataFrame of counts for each source.

Parameters:
  • value_counts (DataFrame) –

    A polars.DataFrame with sources in the first column and counts for those sources in the second. The counts represent how many annotations a source contributed to the query.

Returns:
  • list[Reference]

    A list of reference objects from the metahq_core.sources module.

format_citation(reference)

Within this script, citations are written with newline characters to maintain readability. This function removes the newline characters and reformats the citation as a single stripped line.

Parameters:
  • reference (Reference) –

    A populated Reference object.

Returns:
  • str

    A citation as a single string with no extra whitespace.
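
One common way to collapse a multi-line citation into a single stripped line is to split on any whitespace and rejoin with single spaces. The sketch below assumes that approach; `collapse_whitespace` is a hypothetical helper that operates on a plain string rather than a Reference object, and it is not necessarily how format_citation is implemented.

```python
def collapse_whitespace(citation: str) -> str:
    """Join all whitespace-separated tokens with single spaces."""
    # str.split() with no arguments splits on runs of spaces, tabs, and
    # newlines and drops leading and trailing whitespace.
    return " ".join(citation.split())


# Citation text taken from the format_reference example in this reference.
multiline = """
    Hicks, P. et al. MetaHQ: Harmonized, high-quality metadata annotations of public
    omics samples and studies. arXiv, (2026).
"""

single_line = collapse_whitespace(multiline)
```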

format_reference(reference, index, indent=' ')

Format a reference for export to CITATION.txt.

Parameters:
  • reference (Reference) –

    A populated Reference object.

  • index (int) –

    Index of the reference in a list. Used for ordered display of references.

  • indent (str, default: ' ' ) –

    Indentation for formatting.

Returns:
  • str

    A formatted reference.

Examples:

>>> from metahq_core.sources import KrishnanLab
>>> from metahq_core.export.references import format_reference
>>> format_reference(KrishnanLab(5), 1)
[1] KrishnanLab
    Hicks, P. et al. MetaHQ: Harmonized, high-quality metadata annotations of public
    omics samples and studies. arXiv, (2026).
    url: https://doi.org/10.5281/zenodo.17663086
    Annotations: 5
    License: CC BY-NC 4.0

format_references(references)

Format a list of (reference, annotation_count) tuples.

Parameters:
  • references (list[tuple[Reference, int]]) –

    List of tuples containing (Reference, annotation_count).

Returns:
  • str

    Formatted string with all references numbered sequentially.
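
Sequential numbering over a list of tuples can be sketched with `enumerate`. The example below is an assumption-laden illustration: it uses plain source names instead of Reference objects, `number_entries` is a hypothetical helper, and the blank line between entries is a guess. Only the `[n]` prefix and the `Annotations:` line are taken from the format_reference example output.

```python
def number_entries(entries: list[tuple[str, int]], indent: str = "    ") -> str:
    """Number each (name, annotation_count) pair starting from 1."""
    blocks = []
    for index, (name, count) in enumerate(entries, start=1):
        # Header line with the running index, then an indented count line.
        blocks.append(f"[{index}] {name}\n{indent}Annotations: {count}")
    # Separating entries with a blank line is an assumption of this sketch.
    return "\n\n".join(blocks)


# Placeholder source names and counts, for illustration only.
numbered = number_entries([("SourceA", 5), ("SourceB", 2)])
```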

save_citations(source_counts, config, logger, verbose=True)

Build and save citations from a polars.DataFrame of source counts.

Parameters:
  • source_counts (DataFrame) –

    A polars.DataFrame with sources in the first column and counts for those sources in the second. The counts represent how many annotations a source contributed to the query.

  • config (CitationConfig) –

    A populated CitationConfig.

  • logger (Logger) –

    Python logger.

  • verbose (bool, default: True ) –

    Verbosity level.