# X-CRISP: domain-adaptable and interpretable CRISPR repair outcome prediction

**Authors:** Colm Seale, Joana P Gonçalves

PMC · DOI: 10.1093/bioadv/vbaf157 · Bioinformatics Advances · 2025-07-02

## TL;DR

X-CRISP is an interpretable neural network that predicts the frequencies of CRISPR end-joining repair outcomes and can be adapted to new cell lines through transfer learning with as few as 50 training samples.

## Contribution

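The paper introduces X-CRISP, an interpretable neural network for CRISPR repair outcome prediction that relies on a minimal set of outcome and sequence features and is adapted to new cell lines through transfer learning.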

## Key findings

- X-CRISP outperforms prior models in predicting detailed and aggregate CRISPR repair outcomes.
- For deletion outcomes, X-CRISP prioritizes microhomology (MH) location over MH sequence properties such as GC content.
- Models pre-trained on wild-type mESC data and adapted by transfer learning improve over direct training on target cell-line data from as few as 50 samples.
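
To make the distinction between MH location and MH sequence properties concrete, the following is a minimal sketch of how such features could be derived for a single deletion. It is not taken from the X-CRISP code base: the function name `microhomology_features`, the feature names, and the simplified MH definition (bases immediately left of the deletion that match the last bases of the deleted segment) are all assumptions made for illustration.

```python
def microhomology_features(seq, cut, start, length):
    """Illustrative (hypothetical) MH features for a deletion removing seq[start:start+length].

    `cut` is the index of the first base to the right of the cut site.
    Simplified definition: the MH is the longest run of bases directly left of the
    deletion that equals the last bases of the deleted segment.
    """
    end = start + length
    k = 0
    # Walk leftwards while the base before the deletion matches the
    # corresponding base at the end of the deleted segment.
    while k < length and start - k - 1 >= 0 and seq[start - k - 1] == seq[end - k - 1]:
        k += 1
    mh = seq[start - k:start]
    # Sequence property: fraction of G/C bases in the MH (0.0 when there is no MH).
    gc = sum(base in "GC" for base in mh) / k if k else 0.0
    return {
        "mh_seq": mh,
        "mh_len": k,
        "gc_fraction": gc,               # a sequence property
        "offset_from_cut": start - cut,  # a location property
    }


# Toy sequence (made up): "GTC" occurs at positions 2-4 and 8-10.
example = microhomology_features("AAGTCCATGTCGG", cut=7, start=5, length=6)
```

In this toy case the deletion is flanked by the repeated `GTC`, so the sketch reports an MH of length 3. A model can then weigh `offset_from_cut` (location) against `gc_fraction` (sequence property) when scoring the outcome.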

## Abstract

Controlling the outcomes of CRISPR editing is crucial for the success of gene therapy. Since donor template-based editing is often inefficient, alternative strategies have emerged that leverage mutagenic end-joining repair instead. Existing machine learning models can accurately predict end-joining repair outcomes; however, generalisability beyond the specific cell line used for training remains a challenge, and interpretability is typically limited by suboptimal feature representation and model architecture.

We propose X-CRISP, a flexible and interpretable neural network for predicting repair outcome frequencies based on a minimal set of outcome and sequence features, including microhomologies (MH). Outperforming prior models on detailed and aggregate outcome predictions, X-CRISP prioritised MH location over MH sequence properties such as GC content for deletion outcomes. Through transfer learning, we adapted X-CRISP pre-trained on wild-type mESC data to target human cell lines K562, HAP1, U2OS, and mESC lines with altered DNA repair function. Adapted X-CRISP models improved over direct training on target data from as few as 50 samples, suggesting that this strategy could be leveraged to build models for new domains using a fraction of the data required to train models from scratch.
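
The pre-train-then-adapt strategy described above can be sketched in a few lines. This is a hedged illustration only: it replaces the X-CRISP neural network with a tiny linear softmax model, and the feature values, frequencies, learning rate, and the helper names `predict`, `loss`, and `train` are all invented for the example. It shows the mechanics of starting target-domain training from pre-trained weights and says nothing about real performance.

```python
import math


def predict(w, outcomes):
    """Softmax over linear outcome scores: predicted frequencies for one target site."""
    scores = [sum(wi * xi for wi, xi in zip(w, x)) for x in outcomes]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def loss(w, sites):
    """Mean cross-entropy between observed and predicted outcome frequencies."""
    value = 0.0
    for outcomes, freqs in sites:
        probs = predict(w, outcomes)
        value -= sum(f * math.log(p) for f, p in zip(freqs, probs))
    return value / len(sites)


def train(w, sites, lr=0.1, steps=200):
    """Plain gradient descent starting from the given weights; returns new weights."""
    w = list(w)
    for _ in range(steps):
        grad = [0.0] * len(w)
        for outcomes, freqs in sites:
            probs = predict(w, outcomes)
            for x, f, p in zip(outcomes, freqs, probs):
                for j, xj in enumerate(x):
                    # Gradient of softmax cross-entropy: (predicted - observed) * feature
                    grad[j] += (p - f) * xj / len(sites)
        w = [wi - lr * g for wi, g in zip(w, grad)]
    return w


# Each site: (per-outcome feature vectors, observed outcome frequencies). Toy values.
source_sites = [
    ([[3, 1], [1, 2], [0, 0]], [0.6, 0.3, 0.1]),
    ([[2, 0], [0, 3], [1, 1]], [0.5, 0.2, 0.3]),
]
target_sites = [
    ([[3, 1], [1, 2], [0, 0]], [0.3, 0.5, 0.2]),
]

w_pretrained = train([0.0, 0.0], source_sites)               # pre-train on the source domain
w_adapted = train(w_pretrained, target_sites, steps=50)       # adapt on a few target samples
w_scratch = train([0.0, 0.0], target_sites, steps=50)         # baseline: direct training
```

Comparing `loss(w_adapted, target_sites)` with `loss(w_scratch, target_sites)` on held-out target data is the kind of comparison the abstract reports. With toy data like this, the outcome of that comparison is not meaningful.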

X-CRISP is available at https://github.com/joanagoncalveslab/xcrisp.

## Linked entities

- **Species:** Homo sapiens (taxon 9606)

## Full-text entities

- **Species:** Homo sapiens (human, species) [taxon 9606]
- **Cell lines:**
  - U2OS: Homo sapiens (Human), Osteosarcoma, Cancer cell line (CVCL_0042)
  - mESC: Mus musculus (Mouse), Embryonic stem cell (CVCL_4378)
  - HAP1: Homo sapiens (Human), Chronic myelogenous leukemia, BCR-ABL1 positive, Cancer cell line (CVCL_Y019)
  - K562: Homo sapiens (Human), Blast phase chronic myelogenous leukemia, BCR-ABL1 positive, Cancer cell line (CVCL_0004)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12270252/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12270252/full.md

## References

33 references — full list in the complete paper: https://tomesphere.com/paper/PMC12270252/full.md

---
Source: https://tomesphere.com/paper/PMC12270252