# MetaPepticon: automated prediction of anticancer peptides from microbial genomes and metagenomes

**Authors:** Ahmet Arıhan Erözden, Nalan Tavşanlı, Gamze Demirel, Nazmiye Ozlem Sanli, Mahmut Çalışkan, Muzaffer Arıkan

PMC · DOI: 10.7717/peerj.20990 · PeerJ · 2026-03-27

## TL;DR

MetaPepticon is an automated pipeline that predicts anticancer peptide candidates directly from large-scale microbial sequencing data, making discovery scalable and reproducible.

## Contribution

MetaPepticon introduces an end-to-end pipeline for scalable, reproducible anticancer peptide (ACP) prediction from diverse sequencing inputs, with no manual preprocessing required.

## Key findings

- Applied to 41,171 microbial genomes and 4,072,884 peptides, MetaPepticon identified 10,725 moderate-agreement ACP candidates.
- Among the candidates, 4,590 were novel and non-toxic, expanding the pool of potential anticancer peptides.
- The pipeline automates quality control, filtering, assembly, small open reading frame prediction, and in silico toxicity filtering, and combines multiple ACP predictors within a configurable agreement framework.

## Abstract

Anticancer peptides (ACPs) are increasingly recognized as promising therapeutic candidates due to their ability to selectively target cancer cells. However, the systematic discovery of novel ACPs, particularly from high-throughput sequencing datasets, remains hindered by technical and methodological limitations. Current prediction frameworks require pre-extracted peptide sequences, involve manual preprocessing, and yield variable results, which restricts their applicability for large-scale, data-driven discovery.

To address these limitations, we developed MetaPepticon, a modular, end-to-end pipeline for the discovery of ACP candidates from diverse sequencing inputs, including raw genomic, metagenomic, transcriptomic, and metatranscriptomic reads, as well as assembled contigs and peptide sequences. MetaPepticon automates quality control, filtering, assembly, small open reading frame prediction, ACP classification using multiple predictive algorithms, and in silico toxicity filtering.
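To make the small open reading frame (sORF) step concrete, the following is a minimal sketch of a naive six-frame scan that turns an assembled contig into short candidate peptides. It is not MetaPepticon's implementation, and this summary does not say which sORF predictor the pipeline uses. The function name `find_small_orfs`, the ATG-only start rule, and the 5 to 50 residue length bounds are illustrative assumptions.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_small_orfs(dna, min_aa=5, max_aa=50):
    """Hypothetical helper: collect peptides encoded by short ORFs.

    Assumptions (not taken from the paper): ORFs start at ATG, end at the
    first in-frame stop codon, and are kept only if their length in
    residues lies within [min_aa, max_aa]. Codons containing characters
    other than A, C, G, T are translated as 'X'.
    """
    peptides = set()
    dna = dna.upper()
    for strand in (dna, reverse_complement(dna)):
        for frame in range(3):
            current = None  # residues of the ORF being read, or None
            for i in range(frame, len(strand) - 2, 3):
                aa = CODON_TABLE.get(strand[i:i + 3], "X")
                if current is None:
                    if aa == "M":
                        current = ["M"]  # open a new ORF at a start codon
                elif aa == "*":
                    if min_aa <= len(current) <= max_aa:
                        peptides.add("".join(current))
                    current = None  # close the ORF at the stop codon
                else:
                    current.append(aa)
    return sorted(peptides)


if __name__ == "__main__":
    contig = "CCATGAAATTTGGGCCCTAAGG"
    for peptide in find_small_orfs(contig):
        print(peptide)
```

A production sORF caller would also handle alternative start codons, partial ORFs at contig edges, and coding-potential scoring, all of which this sketch omits.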

MetaPepticon enables scalable and reproducible ACP prediction from raw sequences through integration of multiple predictors within a configurable agreement framework. Applied to 41,171 microbial genomes and 4,072,884 peptides, MetaPepticon identified 10,725 moderate-agreement ACP candidates, including 4,590 novel, non-toxic peptides. MetaPepticon expands the practical applicability of existing ACP prediction methods to high-throughput sequencing data and is freely available at: https://github.com/arikanlab/MetaPepticon.
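The idea of a configurable agreement framework can be sketched as a simple vote count across predictors followed by a toxicity filter. The code below is only an illustration under stated assumptions: the three scoring functions are toy stand-ins rather than real ACP predictors, and `count_votes`, `select_candidates`, the 0.5 score threshold, and the `min_votes` parameter are hypothetical names and values. This summary does not define how many predictors must agree for a candidate to count as "moderate agreement".

```python
def cationic_fraction(peptide):
    """Toy stand-in predictor: fraction of Lys/Arg residues."""
    return sum(aa in "KR" for aa in peptide) / len(peptide)


def hydrophobic_fraction(peptide):
    """Toy stand-in predictor: fraction of hydrophobic residues."""
    return sum(aa in "AILMFVW" for aa in peptide) / len(peptide)


def short_length(peptide):
    """Toy stand-in predictor: 1.0 for peptides of at most 30 residues."""
    return 1.0 if len(peptide) <= 30 else 0.0


def count_votes(peptide, predictors, threshold=0.5):
    """Count predictors whose score meets the threshold (assumed value)."""
    return sum(predict(peptide) >= threshold for predict in predictors)


def select_candidates(peptides, predictors, min_votes, threshold=0.5,
                      is_toxic=None):
    """Keep peptides with at least min_votes positive calls.

    If is_toxic is given, peptides it flags are dropped afterwards,
    mirroring an in silico toxicity filter applied to the candidates.
    """
    kept = []
    for peptide in peptides:
        if count_votes(peptide, predictors, threshold) < min_votes:
            continue
        if is_toxic is not None and is_toxic(peptide):
            continue
        kept.append(peptide)
    return kept


if __name__ == "__main__":
    predictors = [cationic_fraction, hydrophobic_fraction, short_length]
    peptides = ["KKKKLLLL", "DDDDEEEE", "KRKRAAAA"]
    # Raising min_votes makes the selection stricter.
    for min_votes in (1, 2, 3):
        print(min_votes, select_candidates(peptides, predictors, min_votes))
```

Exposing the vote count as a parameter lets a user trade recall for precision: a low `min_votes` keeps more candidates for downstream screening, while a high value keeps only peptides that most predictors support.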

## Linked entities

- **Diseases:** cancer (MONDO:0004992)

## Full-text entities

- **Genes:** CPAT1 (cerebral palsy, ataxic 1) [NCBI Gene 60502] {aka ACP}
- **Diseases:** cancer (MESH:D009369), toxicity (MESH:D064420)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC13034871/full.md

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/PMC13034871/full.md

## References

40 references — full list in the complete paper: https://tomesphere.com/paper/PMC13034871/full.md

---
Source: https://tomesphere.com/paper/PMC13034871