# IdentifiHR predicts homologous recombination deficiency in high-grade serous ovarian carcinoma using gene expression

**Authors:** Ashley L. Weir, Samuel C. Lee, Mengbo Li, Ahwan Pandey, Chin Wee Tan, Dale W. Garsed, Susan J. Ramus, Nadia M. Davidson

PMC · DOI: 10.1038/s43856-026-01387-y · Communications Medicine · 2026-01-14

## TL;DR

IdentifiHR is an open source R package that uses the expression of 209 genes to predict homologous recombination (HR) DNA repair deficiency in high-grade serous ovarian carcinoma, helping identify patients whose tumours carry this therapeutically targetable defect.

## Contribution

IdentifiHR is the first gene expression-based model specifically designed for high-grade serous ovarian carcinoma to predict homologous recombination deficiency.

## Key findings

- IdentifiHR predicts HR status from gene expression with 85% accuracy in the held-out TCGA test set (n = 73) and 86% accuracy in an independent cohort of 99 samples.
- The model outperforms the existing expression-based methods BRCAness, MultiscaleHRD and expHRD in predicting HR status in HGSC.
- IdentifiHR reaches 84% accuracy on pseudobulked single-cell RNA sequencing data from 37 patients.

## Abstract

Approximately half of all high-grade serous ovarian carcinomas (HGSCs) have a therapeutically targetable defect in homologous recombination (HR) DNA repair. Genomic and transcriptomic methods developed for other cancers can identify HR-deficient (HRD) samples, but there are no gene expression-based tools to predict HR status in HGSC specifically. We have built an HGSC-specific model to predict HR status using gene expression.

We separated The Cancer Genome Atlas (TCGA) cohort of HGSCs into training (n = 288) and testing (n = 73) sets and labelled each case as HRD or HR proficient (HRP) based on the clinical standard for classification. Using the training set, we performed differential gene expression analysis between HRD and HRP cases. The 2604 significantly differentially expressed genes were used to train a penalised logistic regression model.
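As a rough illustration of the modelling step described above, the following Python sketch fits a penalised logistic regression to synthetic data. Everything in it is an assumption made for illustration. The abstract does not state the penalty type, so an L1 (lasso) penalty is used here because it can set coefficients to exactly zero, which is one way a model trained on 2604 candidate genes could end up using only 209. The function names (`fit_l1_logistic`, `predict`), the hyperparameters and the simulated data are hypothetical; IdentifiHR itself is distributed as an R package and this is not its code.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(value, t):
    # Proximal operator of the L1 penalty: shrink towards zero by t.
    if value > t:
        return value - t
    if value < -t:
        return value + t
    return 0.0


def linear_score(x, w, b):
    # Intercept plus weighted sum of one sample's features.
    return b + sum(wj * xj for wj, xj in zip(w, x))


def fit_l1_logistic(X, y, lam=0.05, lr=0.5, n_iter=200):
    """Fit L1-penalised logistic regression by proximal gradient descent.

    lam, lr and n_iter are illustrative defaults, not values from the paper.
    """
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(linear_score(xi, w, b)) - yi
            grad_b += err / n
            for j in range(p):
                grad_w[j] += err * xi[j] / n
        # Gradient step on the log-loss, then shrinkage from the L1 penalty.
        b -= lr * grad_b
        w = [soft_threshold(wj - lr * gj, lr * lam) for wj, gj in zip(w, grad_w)]
    return w, b


def predict(X, w, b, threshold=0.5):
    # Label 1 stands in for "HRD" and 0 for "HRP" in this toy setting.
    return [1 if sigmoid(linear_score(xi, w, b)) >= threshold else 0 for xi in X]


# Synthetic stand-in for an expression matrix: 200 samples x 20 "genes".
random.seed(0)
n_samples, n_genes = 200, 20
X = [[random.gauss(0, 1) for _ in range(n_genes)] for _ in range(n_samples)]
# By construction, only genes 0 and 1 carry signal about the label.
y = [1 if 2.0 * x[0] - 2.0 * x[1] + random.gauss(0, 0.5) > 0 else 0 for x in X]

w, b = fit_l1_logistic(X, y)
selected = [j for j, wj in enumerate(w) if wj != 0.0]
train_acc = sum(p == t for p, t in zip(predict(X, w, b), y)) / n_samples
print("genes with non-zero weight:", selected)
print("training accuracy:", round(train_acc, 3))
```

In practice the penalty strength would be chosen by cross-validation on the training set only, and accuracy would be reported on held-out data rather than on the training samples as in this toy run.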

IdentifiHR uses the expression of 209 genes to predict HR status in HGSC. These genes preserve the genomic damage signal, capturing known regions of HR-specific copy number alteration which impact gene expression. IdentifiHR is 85% accurate in the TCGA test set and 86% accurate in an independent cohort of 99 samples, taken from primary tumours, ascites and normal fallopian tubes. Further, IdentifiHR is 84% accurate in pseudobulked single-cell HGSC sequencing from 37 patients and outperforms existing expression-based methods for predicting HR status, namely BRCAness, MultiscaleHRD and expHRD.
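Pseudobulking generally means collapsing the counts of all cells from one sample into a single bulk-like expression profile, which can then be passed to a model built for bulk data. The sketch below assumes aggregation by summing raw counts per patient, and scores predictions with plain accuracy (correct calls divided by total calls). The abstract does not describe the exact procedure used, so the function names, gene names, patient IDs and labels here are all made up for illustration.

```python
from collections import defaultdict


def pseudobulk(cells):
    """Sum per-cell gene counts into one profile per patient.

    `cells` is an iterable of (patient_id, {gene: count}) pairs, one per cell.
    """
    totals = defaultdict(lambda: defaultdict(int))
    for patient_id, counts in cells:
        for gene, count in counts.items():
            totals[patient_id][gene] += count
    return {pid: dict(genes) for pid, genes in totals.items()}


def accuracy(predicted, truth):
    """Fraction of samples whose predicted label matches the true label."""
    if len(predicted) != len(truth) or not truth:
        raise ValueError("predicted and truth must be non-empty and equal length")
    correct = sum(p == t for p, t in zip(predicted, truth))
    return correct / len(truth)


# Toy single-cell counts: two hypothetical patients, two cells each.
cells = [
    ("patient_1", {"GENE_A": 3, "GENE_B": 0}),
    ("patient_1", {"GENE_A": 1, "GENE_B": 2}),
    ("patient_2", {"GENE_A": 0, "GENE_B": 5}),
    ("patient_2", {"GENE_B": 1}),
]
profiles = pseudobulk(cells)

# Toy labels only; these are not results from the paper.
toy_predicted = ["HRD", "HRP", "HRD", "HRP"]
toy_truth = ["HRD", "HRP", "HRP", "HRP"]
toy_accuracy = accuracy(toy_predicted, toy_truth)
print(profiles)
print(toy_accuracy)
```

Summing counts keeps the pseudobulk profile on the same scale as a bulk library, so any normalisation expected by the downstream model would still need to be applied afterwards.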

IdentifiHR is an accurate model to predict HR status in HGSC. It is available as an open source R package, empowering researchers to robustly classify HR status when only transcriptomic sequencing data is available.

High-grade serous ovarian cancer (HGSC) is a type of ovarian cancer with very poor outcomes. However, half of HGSCs have faulty DNA repair that can be targeted for treatment if it is identified. Existing methods look at changes in DNA that arise when repair is faulty, but do not consider which genes are actively being used, or are “expressed”, by the cancer. We developed IdentifiHR, a machine learning method to predict DNA repair status using the expression of 209 genes. We tested IdentifiHR on 209 patient samples and found it correctly predicts repair status in about 85–86% of cases, performing better than existing tools on the same patient data. IdentifiHR is released as a software package for public use.

Weir et al. present IdentifiHR, a logistic regression model to predict the homologous recombination status of a high-grade serous ovarian carcinoma using the expression of 209 genes. Findings reveal that the IdentifiHR model is accurate and can be applied to bulk and single cell RNA sequencing data.

## Linked entities

- **Diseases:** ovarian cancer (MONDO:0005140)

## Full-text entities

- **Diseases:** ascites (MESH:D001201), HR deficient (MESH:C535296), HGSCs (MESH:D010051), Cancer (MESH:D009369)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12910048/full.md

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12910048/full.md

## References

4 references — full list in the complete paper: https://tomesphere.com/paper/PMC12910048/full.md

---
Source: https://tomesphere.com/paper/PMC12910048