# SimPep and OP-AND: A deep learning framework and curated database for predicting osteogenic peptides

**Authors:** Maryam Ghobakhloo, Zahra Ghorbanali, Fatemeh Zare-Mirakabad, Roya Abbaszadeh, Mohammad Taheri-Ledari, Bahman Zeynali

PMC · DOI: 10.1371/journal.pcbi.1013422 · PLOS Computational Biology · 2025-08-29

## TL;DR

Researchers created a database and deep learning model to identify peptides that promote bone growth, which could help prevent bone diseases.

## Contribution

The first public database (OP-AND) and deep learning framework (SimPep) for predicting osteogenic peptides.

## Key findings

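- SimPep achieved 86.87% accuracy and a 76.88% area under the receiver operating characteristic curve (AUC) in five-fold cross-validation.
- The camel milk alpha s1-casein peptide ‘MKLLILTCLVAVALARPKYPLRYPEVF’ was highlighted as a top candidate for future experimental study of osteogenic activity.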

## Abstract

Bone health is a growing concern in aging populations, and bioactive peptides in dairy products offer a promising approach to preventing bone-related diseases. However, the lack of a public database for osteogenic peptides (OPs) has limited computational detection efforts. In this work, we introduce OP-AND, a curated public database of osteogenic peptides. We also propose a novel hypothesis that peptides derived from proteins involved in osteoclast formation may serve as non-osteogenic peptides. Considering the limited availability of OP data, we present SimPep, a deep learning framework that achieves 86.87% accuracy and an area under the receiver operating characteristic curve (AUC) of 76.88% using five-fold cross-validation. SimPep’s performance is further evaluated on external datasets, and a pipeline is introduced to select potential OPs for experimental studies. The camel milk alpha s1-casein peptide ‘MKLLILTCLVAVALARPKYPLRYPEVF’ is highlighted as a top candidate for future exploration. The OP-AND database is available at https://github.com/CBRC-lab/SimPep_and_OP-AND.

Certain small protein fragments, called peptides, found in dairy products have shown potential to support bone growth and prevent diseases such as osteoporosis. However, researchers currently lack a dedicated and organized database to study these bone-strengthening peptides computationally. In this work, we introduce OP-AND, the first publicly available database focused on peptides with bone-forming potential. To facilitate peptide discovery, we also develop a deep learning model named SimPep, designed to predict whether a given peptide exhibits osteogenic properties. To our knowledge, this represents the first comprehensive effort in this area and lays the foundation for future research in computational osteogenic peptide discovery. Our model demonstrates strong performance across various experiments and helps identify promising candidates, such as a peptide derived from camel milk, for further laboratory testing. To support the broader scientific community, we make both the OP-AND database and the SimPep model publicly accessible on GitHub. Additionally, we provide a companion tool, SimPep-App, which enables osteogenicity analysis of peptides.
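
For readers unfamiliar with the evaluation protocol named in the abstract, the following is a minimal sketch of five-fold cross-validation that reports mean accuracy and mean AUC. It is not SimPep's implementation: the function names (`stratified_folds`, `cross_validate`, `fit_toy`), the 0.5 decision threshold, and the stratified splitting are all assumptions made for illustration, and `fit_toy` is a one-feature stand-in for a real model.

```python
import math
import random
from statistics import mean


def stratified_folds(labels, k=5, seed=0):
    """Split sample indices into k folds with roughly equal class proportions."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)  # deal each class round-robin across folds
    return folds


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(y_true, scores):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def cross_validate(X, y, fit, k=5, threshold=0.5):
    """Train on k-1 folds, score the held-out fold, and average both metrics."""
    accs, aucs = [], []
    for test_idx in stratified_folds(y, k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        score = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        y_test = [y[i] for i in test_idx]
        scores = [score(X[i]) for i in test_idx]
        accs.append(accuracy(y_test, [int(s >= threshold) for s in scores]))
        aucs.append(roc_auc(y_test, scores))
    return mean(accs), mean(aucs)


def fit_toy(X_train, y_train):
    """Hypothetical stand-in model: a sigmoid centred between the two class means."""
    mu1 = mean(x for x, t in zip(X_train, y_train) if t == 1)
    mu0 = mean(x for x, t in zip(X_train, y_train) if t == 0)
    mid = (mu0 + mu1) / 2
    sign = 1.0 if mu1 >= mu0 else -1.0
    return lambda x: 1.0 / (1.0 + math.exp(-sign * (x - mid)))
```

Calling `cross_validate(X, y, fit_toy)` on a list of numeric features `X` and binary labels `y` returns the pair `(mean_accuracy, mean_auc)`. Accuracy depends on the chosen threshold while AUC depends only on how the scores rank positives against negatives, which is why the two reported numbers can differ noticeably.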

## Linked entities

- **Diseases:** osteoporosis (MONDO:0005298)

## Full-text entities

- **Diseases:** bone-related diseases (MESH:D001847)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12611171/full.md

## Figures

8 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12611171/full.md

## References

62 references — full list in the complete paper: https://tomesphere.com/paper/PMC12611171/full.md

---
Source: https://tomesphere.com/paper/PMC12611171