CORESH: a gene signature-based search engine for public gene expression datasets
Vladimir Sukhov, Aigul Nugmanova, Yury Vorontsov, Parul Mehrotra, Maksim Kleverov, Kodi Ravichandran, Maxim Artyomov, Alexey Sergushichev

TL;DR
CORESH is a tool that helps researchers find public gene expression datasets that match their own gene activity patterns, without relying on keywords.
Contribution
CORESH introduces a data-driven search engine for public gene expression datasets based on gene signatures rather than keywords.
Findings
CORESH systematically identifies GEO datasets matching user-provided gene signatures.
CORESH ranks datasets by how well their expression patterns match the signature, which can point to the biological mechanisms behind it.
CORESH is freely accessible and regularly updated with new GEO data.
Abstract
Public data repositories like Gene Expression Omnibus (GEO) contain an extensive amount of data from hundreds of thousands of experiments, making them a valuable resource for researchers. A common scenario for utilizing this resource is to show transcriptional similarity of one’s own data to a public dataset as evidence of potentially similar biology. However, when searching for such datasets, researchers are usually limited to keyword-based search, which requires having a specific hypothesis and relies on the presence of high-quality metadata in public datasets. Here, we introduce CORESH, a web server designed to systematically find GEO datasets that match a user-provided gene signature—such as a list of top upregulated genes in response to a treatment—in a data-driven manner. CORESH operates on a compendium of >40 000 human and 40 000 mouse datasets and outputs a ranked list of…
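To make the idea of signature-based ranking concrete, here is a minimal Python sketch. It is not CORESH's scoring method, which this summary does not describe. As a stand-in, it scores each dataset by the mean pairwise Pearson correlation among the signature genes present in it, on the assumption that a relevant dataset shows those genes varying together across samples. The function names (coherence_score, rank_datasets), the gene labels, and the toy datasets are all hypothetical.

```python
import numpy as np


def coherence_score(genes, expr, signature):
    """Mean pairwise Pearson correlation among signature genes in one dataset.

    genes: list of gene names, one per row of expr
    expr:  2-D array, rows = genes, columns = samples
    Returns None when fewer than two signature genes are present.
    NOTE: illustrative stand-in score, not the method used by CORESH.
    """
    wanted = set(signature)
    idx = [i for i, g in enumerate(genes) if g in wanted]
    if len(idx) < 2:
        return None
    corr = np.corrcoef(expr[idx])                 # gene-by-gene correlations
    upper = corr[np.triu_indices(len(idx), k=1)]  # each gene pair once
    return float(upper.mean())


def rank_datasets(datasets, signature):
    """Return (name, score) pairs sorted from best to worst match."""
    scored = []
    for name, (genes, expr) in datasets.items():
        score = coherence_score(genes, expr, signature)
        if score is not None:
            scored.append((name, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy data (hypothetical): five genes measured in eight samples.
rng = np.random.default_rng(0)
genes = ["G1", "G2", "G3", "G4", "G5"]
signature = ["G1", "G2", "G3"]

# In the first dataset the signature genes follow one shared factor.
factor = rng.normal(size=8)
coherent = np.vstack(
    [factor + 0.1 * rng.normal(size=8) for _ in range(3)]
    + [rng.normal(size=8) for _ in range(2)]
)
# In the second dataset every gene is independent noise.
unrelated = rng.normal(size=(5, 8))

datasets = {
    "coherent_dataset": (genes, coherent),
    "unrelated_dataset": (genes, unrelated),
}
ranking = rank_datasets(datasets, signature)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

In this toy setup the dataset whose signature genes move together ranks first. A real search over tens of thousands of GEO datasets would also need gene identifier mapping, normalization, and a way to assess significance, none of which is sketched here.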
Taxonomy
Topics: Gene expression and cancer classification · Bioinformatics and Genomic Networks · Molecular Biology Techniques and Applications
