# CORESH: a gene signature-based search engine for public gene expression datasets

**Authors:** Vladimir Sukhov, Aigul Nugmanova, Yury Vorontsov, Parul Mehrotra, Maksim Kleverov, Kodi Ravichandran, Maxim Artyomov, Alexey Sergushichev

PMC12230675 · DOI: 10.1093/nar/gkaf372 · 2025-05-05

## TL;DR

CORESH is a web server that takes a user-provided gene signature and finds public GEO datasets in which those genes show similar expression patterns, without relying on keyword search or dataset metadata.

## Contribution

The paper introduces CORESH, a data-driven search engine that retrieves public gene expression datasets by gene signature rather than by keyword.

## Key findings

- CORESH systematically identifies GEO datasets matching user-provided gene signatures.
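- It ranks datasets by how similarly the input genes are expressed within each one; the top-ranked datasets point to experimental conditions that activate the signature and so suggest underlying biological mechanisms.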
- CORESH is freely accessible and regularly updated with new GEO data.

## Abstract

Public data repositories like Gene Expression Omnibus (GEO) contain an extensive amount of data from hundreds of thousands of experiments, making them a valuable resource for researchers. A common scenario for utilizing this resource is to show transcriptional similarity of one’s own data to a public dataset as evidence of potentially similar biology. However, when searching for such datasets, researchers are usually limited to keyword-based search, which requires having a specific hypothesis and relies on the presence of high-quality metadata in public datasets. Here, we introduce CORESH, a web server designed to systematically find GEO datasets that match a user-provided gene signature—such as a list of top upregulated genes in response to a treatment—in a data-driven manner. CORESH operates on a compendium of >40 000 human and 40 000 mouse datasets and outputs a ranked list of datasets where the input genes exhibit similar expression patterns. The discovered datasets can then be used to identify experimental conditions associated with the activation of the query signature, offering insights into underlying biological mechanisms and guiding experimental validation. CORESH is freely accessible at https://alserglab.wustl.edu/coresh/, requires no login, and is regularly updated with the latest GEO data.
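The ranking idea described in the abstract can be illustrated with a minimal sketch. The summary does not specify CORESH's actual scoring method, so the score used here (mean pairwise Pearson correlation among the signature genes within each dataset) is an assumed stand-in, and the names `signature_score`, `rank_datasets`, and the toy datasets are hypothetical.

```python
import numpy as np


def signature_score(expr, gene_names, signature):
    """Mean pairwise Pearson correlation of signature genes in one dataset.

    ASSUMPTION: an illustrative score only, not CORESH's published method.
    expr: 2-D array, rows = genes, columns = samples.
    Returns NaN when fewer than two usable signature genes are present.
    """
    index = {gene: i for i, gene in enumerate(gene_names)}
    rows = np.asarray([index[g] for g in signature if g in index], dtype=int)
    sub = expr[rows, :]
    # Correlation is undefined for genes that do not vary across samples.
    sub = sub[sub.std(axis=1) > 0]
    if sub.shape[0] < 2:
        return float("nan")
    corr = np.corrcoef(sub)
    # Average over the strict upper triangle, i.e. each gene pair once.
    pairs = corr[np.triu_indices_from(corr, k=1)]
    return float(pairs.mean())


def rank_datasets(datasets, signature):
    """Return (name, score) pairs sorted from best to worst match."""
    scored = []
    for name, (expr, gene_names) in datasets.items():
        score = signature_score(expr, gene_names, signature)
        if not np.isnan(score):
            scored.append((name, score))
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy compendium (hypothetical data): four genes measured in eight samples.
genes = ["A", "B", "C", "D"]
driver = np.linspace(0.0, 1.0, 8)

# In "coherent", genes A, B and C all follow the same underlying trend.
coherent = np.vstack([driver, 2 * driver + 1, 3 * driver, driver[::-1]])

# In "unrelated", A, B and C vary independently of one another.
unrelated = np.array([
    [0, 1, 0, 1, 0, 1, 0, 1],
    [0, 0, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 0, 1, 1, 1, 1],
    [1, 0, 0, 1, 0, 1, 1, 0],
], dtype=float)

datasets = {"unrelated": (unrelated, genes), "coherent": (coherent, genes)}
ranked = rank_datasets(datasets, signature=["A", "B", "C"])
for name, score in ranked:
    print(f"{name}: {score:.2f}")
```

In this toy case the dataset where the signature genes move together ranks first. A real compendium would also need gene identifier mapping, normalization, and a significance estimate, none of which this sketch attempts.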

## Linked entities

- **Species:** Homo sapiens (taxon 9606), Mus musculus (taxon 10090)

## Full-text entities

- **Species:** Homo sapiens (human, species) [taxon 9606], Mus musculus (house mouse, species) [taxon 10090]

## Figures

3 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12230675/full.md

---
Source: https://tomesphere.com/paper/PMC12230675