# Prediction of liquid-phase separation proteins using Siamese network with feature fusion

**Authors:** Ye-Hong Yang, Qun Liu, Jiang-Feng Liu, Jun-Tao Yang

PMC · DOI: 10.1093/bib/bbaf393 · Briefings in Bioinformatics · 2025-08-06

## TL;DR

This paper introduces a machine learning framework that uses a Siamese network to predict proteins involved in liquid–liquid phase separation (LLPS) by fusing features of the protein itself with features from protein–protein interaction (PPI) networks.

## Contribution

The main contribution is a Siamese-network feature fusion framework for LLPS protein prediction that combines automatically extracted features of the protein itself with embedding features from protein–protein interaction (PPI) networks.

## Key findings

- The Siamese network with feature fusion achieves good accuracy even with small sample sets.
- Node2vec and DeepNF graph embedding methods were compared for their impact on model performance.
- The framework is flexible for fusing different protein features and applicable to other prediction tasks.
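
One common reason Siamese networks cope with small sample sets is that they train on pairs of examples rather than single examples, so a handful of labeled proteins yields many more training pairs. The sketch below illustrates that counting argument only. The function name `make_pairs` and the toy labels are hypothetical, and whether the paper builds its pairs exactly this way is an assumption.

```python
from itertools import combinations


def make_pairs(labels):
    """Return (i, j, same) for every unordered pair of sample indices.

    `same` is 1 when both samples share a label, otherwise 0.
    """
    return [
        (i, j, int(labels[i] == labels[j]))
        for i, j in combinations(range(len(labels)), 2)
    ]


# Hypothetical toy labels: 1 = LLPS protein, 0 = non-LLPS protein.
labels = [1, 1, 1, 0, 0, 0, 0]
pairs = make_pairs(labels)

# n samples give n * (n - 1) / 2 unordered pairs.
n_same = sum(same for _, _, same in pairs)
n_diff = len(pairs) - n_same
print(len(pairs), n_same, n_diff)
```

With 7 labeled samples this produces 21 pairs, which is why pair-based training can stretch a small data set further than per-sample training.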

## Abstract

Liquid–liquid phase separation (LLPS) is a common and important phenomenon where biomolecules form dynamic, membrane-less condensates through multivalent interactions, spontaneously separating into distinct concentration-dense and dilute phases. Research has shown that LLPS is associated with a wide range of cellular functional regulation. In this work, we establish a feature fusion framework based on a Siamese network for the prediction of LLPS proteins, which can integrate automatically extracted features from the protein itself and the protein–protein interaction (PPI) networks, and achieve good accuracy even in small sample sets. We used two representative graph embedding methods, Node2vec and DeepNF, to extract the embedding features of PPI networks and compared the impact of the two methods on model performance at different feature lengths. Our work provides a way for integrating multivalent interactions between proteins that drive LLPS, as well as a flexible framework for the fusion of different types of protein features, not only for LLPS prediction but also for other downstream prediction tasks. All relevant materials can be found at https://github.com/ispotato/SiameseNetwork_LLPS.
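
The framework described in the abstract can be sketched in a few lines, under stated assumptions. Here fusion is plain concatenation of a protein-level feature vector and a PPI embedding vector, the shared encoder is a single linear layer with `tanh`, and training uses the classic contrastive loss. All three choices, the dimensions, the random weights, and the names `fuse`, `encode`, and `contrastive_loss` are illustrative assumptions and are not taken from the paper or its repository.

```python
import math
import random


def fuse(protein_feat, ppi_feat):
    """Fuse two feature vectors by concatenation (assumed, simplest choice)."""
    return list(protein_feat) + list(ppi_feat)


def encode(x, weights):
    """Shared encoder: one linear layer followed by tanh."""
    return [math.tanh(sum(w * v for w, v in zip(row, x))) for row in weights]


def distance(a, b):
    """Euclidean distance between two embeddings."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def contrastive_loss(d, same, margin=1.0):
    """Pull same-class pairs together; push other pairs beyond the margin."""
    return d ** 2 if same else max(0.0, margin - d) ** 2


rng = random.Random(0)
protein_dim, ppi_dim, out_dim = 4, 3, 2
weights = [
    [rng.uniform(-1, 1) for _ in range(protein_dim + ppi_dim)]
    for _ in range(out_dim)
]

# Hypothetical feature values for two proteins.
protein_a = fuse([0.1, 0.4, -0.2, 0.9], [0.3, -0.5, 0.7])
protein_b = fuse([0.0, 0.5, -0.1, 0.8], [0.2, -0.4, 0.6])

# Both branches use the same weights; that sharing is what makes it Siamese.
emb_a = encode(protein_a, weights)
emb_b = encode(protein_b, weights)
d = distance(emb_a, emb_b)

print(contrastive_loss(d, same=True), contrastive_loss(d, same=False))
```

Because the encoder only sees the fused vector, swapping the PPI embedding source (for example Node2vec versus DeepNF) or changing its length only changes `ppi_dim`, which is one way to read the flexibility claimed in the abstract.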

## Full-text entities

- **Genes:** GSN (gelsolin) [NCBI Gene 2934] {aka ADF, AGEL, AMYLD4}, CBS (cystathionine beta-synthase) [NCBI Gene 875] {aka HIP4}, HSPA4 (heat shock protein family A (Hsp70) member 4) [NCBI Gene 3308] {aka APG-2, HEL-S-5a, HS24/P52, HSPH2, RY, hsp70}, PC (pyruvate carboxylase) [NCBI Gene 5091] {aka PCB}, RPL7 (ribosomal protein L7) [NCBI Gene 6129] {aka L7, humL7-1, uL30}, GRIA2 (glutamate ionotropic receptor AMPA type subunit 2) [NCBI Gene 2891] {aka GLUR2, GLURB, GluA2, GluR-K2, HBGR2, NEDLIB}, TIMP3 (TIMP metallopeptidase inhibitor 3) [NCBI Gene 7078] {aka HSMRK222, K222, K222TA2, SFD}
- **Diseases:** Chronic Diseases (MESH:D002908), metabolic diseases (MESH:D008659), neurodegenerative diseases (MESH:D019636), cancer (MESH:D009369)
- **Chemicals:** amino acids (MESH:D000596), ESM1280 (-)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12342145/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12342145/full.md

## References

50 references — full list in the complete paper: https://tomesphere.com/paper/PMC12342145/full.md

---
Source: https://tomesphere.com/paper/PMC12342145