# Uniform design-embedded predictions of (tetra-)peptide physicochemical properties

**Authors:** Zhihui Zhu, Huapeng Liu, Xuechen Li, Haojin Zhou, Jiaqi Wang

PMC · DOI: 10.1093/bioinformatics/btag036 · 2026-01-19

## TL;DR

This paper introduces a new method combining uniform design and AI to predict the properties of tetrapeptides, enabling better design of functional peptides for drug discovery and materials science.

## Contribution

A novel integration of uniform design and AI for unbiased prediction of tetrapeptide physicochemical properties and their self-assembly behaviors.

## Key findings

- Uniform design generates 31 unbiased peptide datasets for AI training.
- AI models accurately predict aggregation propensity, hydrophilicity, and isoelectric point of tetrapeptides.
- SHAP analysis reveals relationships between physicochemical properties and self-assembly behaviors.

## Abstract

Short peptides hold significant promise in drug discovery and materials science owing to their biocompatibility, multifunctionality, and ease of synthesis. However, accurately predicting their physicochemical properties, a prerequisite for application development, remains a grand challenge because of the sheer number of possible peptides.

This study presents an approach that integrates uniform design (UD), used to sample the whole sequence space, with artificial intelligence (AI) models trained on the sampled data. The goal is to improve prediction of key physicochemical properties, including aggregation propensity (AP), hydrophilicity (logP), and isoelectric point (pI), across the complete sequence space of tetrapeptides (160 000 sequences). Using UD, we generate 31 distinct peptide datasets in which every amino acid occupies a consistent 5% fraction at each position, yielding unbiased training data with no amino acid preferences. This work provides comprehensive datasets on the physicochemical properties of all tetrapeptides, develops robust AI-based predictive models, and quantitatively elucidates the relationships between key physicochemical attributes and self-assembly behaviors of short peptides through Shapley Additive Explanations (SHAP) analysis. By integrating strategic experimental design (i.e. UD), AI modeling, and peptide domain knowledge, our approach facilitates the discovery and optimization of functional peptides, offering new opportunities for peptide-based therapeutic applications.
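The two numbers in the abstract follow from simple counting: 20 standard amino acids at 4 positions give 20^4 = 160 000 tetrapeptides, and equal use of all 20 residues at a position gives an occupation fraction of 1/20 = 5%. The sketch below illustrates that balance property on a toy scale. It is not the authors' UD tables or code: the function names, the generator tuple `(1, 3, 7, 9)`, and the cyclic construction (loosely modelled on a good-lattice-point style design) are assumptions made purely for illustration.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
K = len(AMINO_ACIDS)


def balanced_block(generators=(1, 3, 7, 9), shift=0):
    """Build 20 tetrapeptides in which each residue appears exactly once
    per position. Hypothetical construction: every generator is coprime
    with 20, so (run * g) % 20 visits each residue index once."""
    seqs = []
    for run in range(K):
        idx = [(run * g + shift) % K for g in generators]
        seqs.append("".join(AMINO_ACIDS[i] for i in idx))
    return seqs


def position_occupancy(seqs):
    """Return, for each position, the fraction of sequences using each residue."""
    n = len(seqs)
    length = len(seqs[0])
    return [
        {aa: count / n for aa, count in Counter(s[p] for s in seqs).items()}
        for p in range(length)
    ]


block = balanced_block()
occupancy = position_occupancy(block)

print(K ** 4)  # size of the full tetrapeptide space
print(block[:2])
print(all(len(pos) == K and all(abs(f - 0.05) < 1e-12 for f in pos.values())
          for pos in occupancy))
```

A block built this way covers only 20 of the 160 000 sequences, yet no residue is over- or under-represented at any position, which is the kind of balance the abstract describes for its training sets. The actual designs and dataset sizes are those in the repository linked below the abstract.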

The complete datasets, source code, and pretrained models are available in the GitHub repository (https://github.com/JiaqiBenWang/UD-AI-Peptide) and on Zenodo (https://doi.org/10.5281/zenodo.17984124).

## Full-text entities

- **Chemicals:** Peptide (MESH:D010455)

## Figures

The complete paper contains 9 figures with captions: https://tomesphere.com/paper/PMC13032896/full.md

---
Source: https://tomesphere.com/paper/PMC13032896