# A network-guided penalized regression with application to proteomics data

**Authors:** Seungjun Ahn, Eun Jeong Oh

PMC · DOI: 10.1093/bioadv/vbag038 · Bioinformatics Advances · 2026-02-03

## TL;DR

This paper introduces a new statistical method that uses protein networks to identify important proteins for predicting disease outcomes.

## Contribution

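The novel contribution is a network-guided penalized regression method: hub proteins identified from a Gaussian graphical model are preserved in the model together with clinically relevant factors, while adaptive Lasso performs variable selection among the non-hub proteins.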

## Key findings

- Simulation results suggest that the proposed method produces better results than existing methods.
- Application to CPTAC data reveals hub proteins linked to rare genetic disorders and cancer immunotherapy.
- The network-guided estimators are shown to have variable selection consistency and asymptotic normality.

## Abstract

Network theory has proven invaluable in unraveling complex protein interactions. Previous studies have employed statistical methods rooted in network theory, including the Gaussian graphical model, to infer networks among proteins, identifying hub proteins based on key structural properties of networks such as degree centrality. However, there has been limited research examining the prognostic role of hub proteins in outcomes while adjusting for clinical covariates in the context of high-dimensional data.
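To make the idea of degree-based hub identification concrete, here is a minimal sketch in Python with NumPy. It is not the NetGreg implementation: the function name `find_hubs`, the ridge-regularized inverse covariance used as a stand-in precision estimate, and the `threshold` and `n_hubs` parameters are all assumptions chosen for illustration. This summary does not state which Gaussian graphical model estimator the paper uses.

```python
import numpy as np

def find_hubs(X, n_hubs=1, ridge=0.01, threshold=0.2):
    """Rank nodes by degree in a thresholded partial-correlation network.

    Hypothetical helper: rows of X are samples, columns are proteins.
    """
    # Sample covariance between proteins.
    S = np.cov(X, rowvar=False)
    # Ridge-regularized inverse as a simple precision estimate (assumption).
    omega = np.linalg.inv(S + ridge * np.eye(S.shape[0]))
    d = np.sqrt(np.diag(omega))
    # Partial correlation: rho_ij = -omega_ij / sqrt(omega_ii * omega_jj).
    pcor = -omega / np.outer(d, d)
    np.fill_diagonal(pcor, 0.0)
    # An edge exists where the absolute partial correlation exceeds the threshold.
    adjacency = np.abs(pcor) > threshold
    # Degree centrality: number of edges attached to each node.
    degree = adjacency.sum(axis=1)
    # Stable sort so ties break toward the lower column index.
    order = np.argsort(-degree, kind="stable")
    return order[:n_hubs], degree

# Toy data: proteins 1-5 depend on protein 0, which makes protein 0 a hub.
rng = np.random.default_rng(0)
n, p = 2000, 10
X = rng.normal(size=(n, p))
X[:, 1:6] = 0.8 * X[:, [0]] + 0.6 * X[:, 1:6]

hubs, degree = find_hubs(X, n_hubs=1)
print("hub indices:", hubs)
print("degrees:", degree)
```

In high-dimensional settings where proteins outnumber samples, a sparse estimator such as the graphical Lasso would be a more typical choice than the ridge inverse used in this toy example.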

To address this gap, we propose a network-guided penalized regression method. First, we construct a network using the Gaussian graphical model to identify hub proteins. Next, we preserve these identified hub proteins along with clinically relevant factors, while applying adaptive Lasso to non-hub proteins for variable selection. Our network-guided estimators are shown to have variable selection consistency and asymptotic normality. Simulation results suggest that our method produces better results than existing methods and demonstrates promise for advancing biomarker identification in proteomics research. Lastly, we apply our method to the Clinical Proteomic Tumor Analysis Consortium (CPTAC) data and identify hub proteins that may serve as prognostic biomarkers for various diseases, including rare genetic disorders, and as immune checkpoints for cancer immunotherapy.
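The second step, preserving selected columns while penalizing the rest, can be sketched as an adaptive Lasso in which the preserved columns receive a penalty weight of zero. The code below is an illustrative sketch only and not the NetGreg API: the function name `network_guided_alasso`, the ridge-based initial estimate for the adaptive weights, the coordinate-descent solver, and all parameter defaults are assumptions. It also assumes centered predictors and outcome, so no intercept is fitted. The objective minimized is $\frac{1}{2n}\lVert y - X\beta\rVert^2 + \lambda \sum_j w_j \lvert\beta_j\rvert$ with $w_j = 0$ for preserved columns.

```python
import numpy as np

def soft_threshold(z, t):
    """Scalar soft-thresholding operator."""
    return np.sign(z) * max(abs(z) - t, 0.0)

def network_guided_alasso(X, y, keep, lam=0.1, gamma=1.0, ridge=1.0, n_iter=200):
    """Adaptive Lasso by coordinate descent; columns in `keep` are unpenalized.

    Hypothetical helper: `keep` lists the column indices of hub proteins
    and clinical covariates that should always stay in the model.
    """
    n, p = X.shape
    # Initial ridge estimate supplies the adaptive weights (assumption).
    beta_init = np.linalg.solve(X.T @ X + ridge * np.eye(p), X.T @ y)
    weights = 1.0 / (np.abs(beta_init) ** gamma + 1e-8)
    weights[list(keep)] = 0.0  # no penalty on preserved columns
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    resid = y - X @ beta
    for _ in range(n_iter):
        for j in range(p):
            # Correlation of column j with the partial residual (column j excluded).
            rho = X[:, j] @ resid / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam * weights[j]) / col_sq[j]
            # Keep the residual in sync with the updated coefficient.
            resid += X[:, j] * (beta[j] - new)
            beta[j] = new
    return beta

# Toy data: column 0 plays a hub with a weak effect, column 1 a clinical
# covariate, column 2 an active non-hub protein, columns 3-7 pure noise.
rng = np.random.default_rng(1)
n, p = 200, 8
X = rng.normal(size=(n, p))
beta_true = np.array([0.05, 0.3, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0])
y = X @ beta_true + 0.5 * rng.normal(size=n)

beta_hat = network_guided_alasso(X, y, keep=[0, 1])
print(np.round(beta_hat, 3))
```

Because the weight on the preserved columns is zero, their coefficients are never shrunk to exactly zero, even when the effect is weak, whereas noise columns among the penalized set can be removed from the model.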

The R package is freely available on the CRAN repository (https://CRAN.R-project.org/package=NetGreg) and is published under the General Public License version 3.

## Linked entities

- **Diseases:** cancer (MONDO:0004992)

## Full-text entities

- **Genes:** PLAUR (plasminogen activator, urokinase receptor) [NCBI Gene 5329] {aka CD87, U-PAR, UPAR, URKR}, CD44 (CD44 molecule (IN blood group)) [NCBI Gene 960] {aka CDW44, CSPG8, ECM-III, ECMR-III, H-CAM, HCELL}, PABPC1 (poly(A) binding protein cytoplasmic 1) [NCBI Gene 26986] {aka PAB1, PABP, PABP1, PABPC2, PABPL1}, LGALS1 (galectin 1) [NCBI Gene 3956] {aka GAL1, GBP}, GIMAP7 (GTPase, IMAP family member 7) [NCBI Gene 168537] {aka IAN7, hIAN7}, PRDX1 (peroxiredoxin 1) [NCBI Gene 5052] {aka MSP23, NKEF-A, NKEFA, PAG, PAGA, PAGB}
- **Diseases:** alcohol- (MESH:D000437), viral diseases (MESH:D014777), liver diseases (MESH:D008107), osteosarcoma (MESH:D012516), cardiovascular diseases (MESH:D002318), lung cancer (MESH:D008175), Alzheimer's disease (MESH:D000544), CPTAC (MESH:D009369), BLNK (MESH:D000361), HNSCC (MESH:D000077195), rift valley fever (MESH:D012295), multiple sclerosis (MESH:D009103), genetic disorders (MESH:D030342), corneal ulcer (MESH:D003320)
- **Species:** Homo sapiens (human, species) [taxon 9606], Dengue virus (no rank) [taxon 12637]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12949433/full.md

## Figures

2 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12949433/full.md

## References

56 references — full list in the complete paper: https://tomesphere.com/paper/PMC12949433/full.md

---
Source: https://tomesphere.com/paper/PMC12949433