Citation information
Citing MetaHQ
The MetaHQ preprint is available at https://arxiv.org/abs/2602.07805. To cite MetaHQ, please use the following reference:
@misc{hicks2026metahqharmonizedhighqualitymetadata,
title={MetaHQ: Harmonized, high-quality metadata annotations of public omics samples and studies},
author={Parker Hicks and Lydia E Valtadoros and Christopher A Mancuso and Faisal Alquadoomi and Kayla A Johnson and Sneha Sundar and Arjun Krishnan},
year={2026},
eprint={2602.07805},
archivePrefix={arXiv},
primaryClass={q-bio.GN},
url={https://arxiv.org/abs/2602.07805},
}
Citing individual annotation sets
Many annotations in the MetaHQ database are derived from external curation efforts. We require that you cite every source that contributed to your retrieved annotation set.
- Note: Sample and series annotation counts indicate how many individual sample- or series-level annotations a single source provides, not the number of samples or series. For example, Johnson_2023 may annotate tissue, disease, sex, and age for a single sample. This counts as four sample-level annotations.
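To make the counting rule concrete, here is a minimal sketch. The record layout (tuples of sample ID, attribute, and source) and the sample ID itself are hypothetical and chosen only for illustration; they are not the MetaHQ schema.

```python
from collections import Counter

# Hypothetical sample-level annotation records: (sample_id, attribute, source).
# The layout and the sample ID are illustrative, not the MetaHQ schema.
annotations = [
    ("GSM0000001", "tissue", "Johnson_2023"),
    ("GSM0000001", "disease", "Johnson_2023"),
    ("GSM0000001", "sex", "Johnson_2023"),
    ("GSM0000001", "age", "Johnson_2023"),
]

# Annotation count: one per (sample, attribute) record contributed by a source.
annotation_counts = Counter(source for _, _, source in annotations)

# Sample count: number of distinct samples each source annotates.
distinct_pairs = {(sample, source) for sample, _, source in annotations}
sample_counts = Counter(source for _, source in distinct_pairs)

print(annotation_counts["Johnson_2023"])  # 4
print(sample_counts["Johnson_2023"])      # 1
```

The same source therefore contributes four sample-level annotations while covering only one sample.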
1. ALE
- Source: Giles CB, et al. (2017) BMC Bioinformatics
- Citation: Giles, C. B. et al. ALE: automated label extraction from GEO metadata. BMC Bioinformatics 18, 509 (2017).
- DOI: 10.1186/s12859-017-1888-1
- Rights Statement: From Rights and Permissions in paper
- Access: https://github.com/wrenlab/label-extraction/blob/master/data/manual/geo_manual_labels_jdw.tsv
2. Bgee
- Source: Bgee: Gene Expression Data in Animals
- Citation: Bastian FB, et al. (2021) The Bgee suite: integrated curated expression atlas and comparative transcriptomics in animals. Nucleic Acids Research 49(D1): D831–D847. https://doi.org/10.1093/nar/gkaa793
- DOI: 10.1093/nar/gkaa793
- Rights Statement: From web server
- Access: https://www.bgee.org
3. CellO
- Source: CellO Cell Type Classification
- Citation: Bernstein, M. N., Ma, Z., Gleicher, M. & Dewey, C. N. CellO: Comprehensive and hierarchical cell type classification of human cells with the Cell Ontology. iScience 24 (2021).
- DOI: 10.1016/j.isci.2020.101913
- Rights Statement: From Zenodo
- Access: https://zenodo.org/records/4609473
4. CREEDS
- Source: CREEDS: CRowd Extracted Expression of Differential Signatures
- Citation: Wang, Z. et al. Extraction and analysis of signatures from the Gene Expression Omnibus by the crowd. Nature Communications 7, 12846 (2016).
- DOI: 10.1038/ncomms12846
- Rights Statement: From web server
- Access: https://maayanlab.cloud/CREEDS/
5. DiSignAtlas
- Source: DiSignAtlas
- License: Free for academic usage only (NonCommercial)
- Citation: Zhai, Z. et al. DiSignAtlas: an atlas of human and mouse disease signatures based on bulk and single-cell transcriptomics. Nucleic Acids Research 52, D1236–D1245 (2024).
- DOI: 10.1093/nar/gkad961
- Rights Statement: From web server
- Access: http://www.inbirg.com/disignatlas/
- Commercial Use: For commercial usage, contact Prof. Jianbo Pan.
- ⚠️ NonCommercial Restriction
6. Gemma
- Source: Gemma Database
- Citation: Lim, N. et al. Curation of over 10,000 transcriptomic studies to enable data reuse. Database 2021, baab006 (2021).
- DOI: 10.1093/database/baab006
- Rights Statement: From web server
- Access: https://gemma.msl.ubc.ca/home.html
- ⚠️ NonCommercial Restriction
7. Golightly_2018
- Source: Golightly, et al. (2018) Scientific Data
- Citation: Golightly NP, et al. (2018) Curated compendium of human transcriptional biomarker data. Scientific Data 5: 180066. https://doi.org/10.1038/sdata.2018.66
- DOI: 10.1038/sdata.2018.66
- Rights Statement: From Rights and Permissions in paper
- Access: https://osf.io/ssk3t/overview
8. Gu_2023
- Source: Gu, et al. (2023) Genomics, Proteomics & Bioinformatics
- Citation: Gu, J., Dai, J., Lu, H. & Zhao, H. Comprehensive analysis of ubiquitously expressed genes in humans from a data-driven perspective. Genomics, Proteomics & Bioinformatics 21, 164–176 (2023).
- DOI: 10.1016/j.gpb.2021.08.017
- Rights Statement: From Rights and Permissions in paper
- Access: Table S3 in https://academic.oup.com/gpb/article/21/1/164/7274179
9. Johnson_2023
- Source: Johnson and Krishnan. (2023) bioRxiv
- Citation: Johnson, K. A. & Krishnan, A. Human pan-body age- and sex-specific molecular phenomena inferred from public transcriptome data using machine learning. bioRxiv, 2023–01 (2023).
- DOI: 10.1101/2023.01.12.523796
- Rights Statement: From Copyright statement in paper
- Note: Annotations covered by paper license; GitHub repository license to be clarified
- Access: https://github.com/krishnanlab/Age-sex_signatures_in_humans_code/tree/master/data/labels/full
10. KrishnanLab
- Source: MetaHQ in-house curation
- Citation: Hicks, P. et al. MetaHQ: Harmonized, high-quality metadata annotations of public omics samples and studies. arXiv, (2026).
- DOI: 10.48550/arXiv.2602.07805
- Access: Published with MetaHQ. Access through the MetaHQ database at https://doi.org/10.5281/zenodo.17663086
11. Sirota_2011
- Source: Sirota, et al. (2011) Science Translational Medicine
- Citation: Sirota, M. et al. Discovery and preclinical validation of drug indications using compendia of public gene expression data. Science Translational Medicine 3, 96ra77 (2011).
- DOI: 10.1126/scitranslmed.3001318
- Access: Table S1 in https://www.science.org/doi/10.1126/scitranslmed.3001318
- Note: No explicit license declaration was found. This source is published under the Science Translational Medicine AAAS Open Access program, which allows CC BY and CC BY-NC licenses, so we assume the stricter of the two (CC BY-NC).
- ⚠️ NonCommercial Restriction
12. URSA
- Source: URSA (Unveiling RNA Sample Annotation)
- Citation: Lee, Y., Krishnan, A., Zhu, Q. & Troyanskaya, O. G. Ontology-aware classification of tissue and cell-type signals in gene expression profiles across platforms and technologies. Bioinformatics 29, 3036–3044 (2013).
- DOI: 10.1093/bioinformatics/btt529
- Rights Statement: From Permissions in paper
- Access: Access through the MetaHQ database at https://doi.org/10.5281/zenodo.17663086.
- Note: The original web server that housed these annotations (ursa.princeton.edu) is no longer active.
- ⚠️ NonCommercial Restriction
13. URSA_HD
- Source: URSA-HD (Human Disease)
- License: Elsevier NonCommercial
- Citation: Lee, Y. et al. A computational framework for genome-wide characterization of the human disease landscape. Cell Systems 8, 152–162 (2019).
- DOI: 10.1016/j.cels.2018.12.010
- Rights Statement: From Permissions in paper
- Access: Table S1 in https://www.cell.com/cell-systems/fulltext/S2405-4712(18)30509-X
- ⚠️ NonCommercial Restriction
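As a rough sketch of how the citation requirement might be applied in practice, the snippet below collects the unique sources present in a retrieved annotation set and flags those marked with a NonCommercial restriction on this page. The function name `sources_to_cite` and the input format (a list of source names, one per annotation) are assumptions for illustration and are not part of MetaHQ.

```python
# Sources marked with a NonCommercial restriction on this page.
NONCOMMERCIAL_SOURCES = {"DiSignAtlas", "Gemma", "Sirota_2011", "URSA", "URSA_HD"}


def sources_to_cite(retrieved_sources):
    """Return (unique sources to cite, subset flagged NonCommercial).

    `retrieved_sources` is a hypothetical iterable of source names,
    one entry per retrieved annotation.
    """
    unique = sorted(set(retrieved_sources))
    restricted = [name for name in unique if name in NONCOMMERCIAL_SOURCES]
    return unique, restricted


# Example input: source names as they might appear in a retrieved set.
unique, restricted = sources_to_cite(["Bgee", "URSA", "Bgee", "Gemma", "ALE"])
print(unique)      # ['ALE', 'Bgee', 'Gemma', 'URSA']
print(restricted)  # ['Gemma', 'URSA']
```

Each name in `unique` corresponds to an entry above whose citation should be included; any name in `restricted` means the retrieved set carries a NonCommercial restriction.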