Towards a deeper GCN: Alleviate over-smoothing with iterative training and fine-tuning

Furong Peng; Jinzhen Gao; Xuan Lu; Kang Liu; Yifan Huo; Sheng Wang

arXiv:2506.17576·cs.LG·July 23, 2025

TL;DR

This paper shows that trainable linear transformations contribute to over-smoothing in deep GCNs, and proposes Layer-wise Gradual Training (LGT), a training strategy that makes very deep GCNs trainable with improved performance and stability.

Contribution

It introduces LGT, a training strategy that combines layer-wise optimization, low-rank fine-tuning, and identity initialization to stabilize deep GCN training and mitigate over-smoothing.
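The page does not spell out how identity initialization and low-rank fine-tuning are implemented, so the following is only a minimal NumPy sketch under stated assumptions, not the authors' code. It assumes a newly added layer is parameterized as W = I + U V, with V set to zero at initialization, so that the new layer initially reduces to plain graph propagation and only a low-rank correction is learned afterwards. The names `new_layer_weight`, `layer_forward`, `u`, `v`, and `rank` are hypothetical, and the nonlinearity is omitted for clarity.

```python
import numpy as np

def normalized_adjacency(adj):
    """Symmetric GCN normalization: D^{-1/2} (A + I) D^{-1/2}."""
    a_hat = adj + np.eye(adj.shape[0])   # add self-loops
    deg = a_hat.sum(axis=1)              # degrees are >= 1 thanks to self-loops
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def new_layer_weight(dim, rank, rng):
    """Assumed parameterization W = I + U @ V (hypothetical, for illustration).

    V starts at zero, so W equals the identity when the layer is added.
    """
    u = rng.normal(scale=0.01, size=(dim, rank))
    v = np.zeros((rank, dim))
    return u, v

def layer_forward(s, h, u, v):
    """One linear graph-convolution step S @ H @ W (activation omitted)."""
    w = np.eye(h.shape[1]) + u @ v
    return s @ h @ w

rng = np.random.default_rng(0)
# 4-node path graph as a toy example
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
s = normalized_adjacency(adj)
h = rng.normal(size=(4, 3))              # features from the already-trained layers
u, v = new_layer_weight(dim=3, rank=1, rng=rng)

out = layer_forward(s, h, u, v)
# At initialization the new layer behaves like pure propagation S @ H.
identity_preserved = np.allclose(out, s @ h)
```

Under this assumption, appending a layer does not disturb the representation learned so far, and the trainable part of the new layer has only `2 * dim * rank` parameters rather than `dim * dim`.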

Findings

01

LGT achieves state-of-the-art accuracy on deep GCNs up to 32 layers.

02

LGT improves performance when combined with existing normalization methods.

03

Deep GCNs trained with LGT outperform those trained with traditional methods.

Abstract

Graph Convolutional Networks (GCNs) suffer from severe performance degradation in deep architectures due to over-smoothing. While existing studies primarily attribute the over-smoothing to repeated applications of graph Laplacian operators, our empirical analysis reveals a critical yet overlooked factor: trainable linear transformations in GCNs significantly exacerbate feature collapse, even at moderate depths (e.g., 8 layers). In contrast, Simplified Graph Convolution (SGC), which removes these transformations, maintains stable feature diversity up to 32 layers, highlighting linear transformations' dual role in facilitating expressive power and inducing over-smoothing. However, completely removing linear transformations weakens the model's expressive capacity. To address this trade-off, we propose Layer-wise Gradual Training (LGT), a novel training strategy that progressively builds…
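To make the GCN versus SGC contrast in the abstract concrete, here is a small NumPy sketch of the two forward passes: SGC applies only the normalized adjacency repeatedly, while a GCN interleaves a linear transformation and a ReLU at every layer. The diversity proxy `mean_pairwise_distance`, the random weights, and the ring graph are assumptions chosen for illustration; the paper's own metric, datasets, and trained weights are not given on this page, so no particular outcome is claimed.

```python
import numpy as np

def normalized_adjacency(adj):
    """Symmetric GCN normalization: D^{-1/2} (A + I) D^{-1/2}."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def sgc_features(s, x, depth):
    """SGC-style propagation: S^depth @ X, with no per-layer weights."""
    h = x
    for _ in range(depth):
        h = s @ h
    return h

def gcn_features(s, x, weights):
    """GCN-style stack: H <- ReLU(S @ H @ W) for each weight matrix W."""
    h = x
    for w in weights:
        h = np.maximum(s @ h @ w, 0.0)
    return h

def mean_pairwise_distance(h):
    """Hypothetical diversity proxy: mean distance between unit-norm node features."""
    norms = np.linalg.norm(h, axis=1, keepdims=True)
    unit = h / np.maximum(norms, 1e-12)          # guard against all-zero rows
    diff = unit[:, None, :] - unit[None, :, :]   # all node pairs
    return float(np.linalg.norm(diff, axis=-1).mean())

rng = np.random.default_rng(0)
n, dim, depth = 6, 4, 8

# Toy ring graph with 6 nodes
adj = np.zeros((n, n))
for i in range(n):
    adj[i, (i + 1) % n] = adj[(i + 1) % n, i] = 1.0

s = normalized_adjacency(adj)
x = rng.normal(size=(n, dim))
# Untrained random weights, standing in for learned linear transformations
weights = [rng.normal(scale=0.5, size=(dim, dim)) for _ in range(depth)]

sgc_out = sgc_features(s, x, depth)
gcn_out = gcn_features(s, x, weights)
sgc_div = mean_pairwise_distance(sgc_out)
gcn_div = mean_pairwise_distance(gcn_out)
print(f"SGC diversity: {sgc_div:.4f}  GCN diversity: {gcn_div:.4f}")
```

A value near zero for the proxy means the node features point in almost the same direction, which is one simple way to read "feature collapse". Tracking it across depths for both models is the kind of comparison the abstract describes, although the numbers from this toy setup say nothing about the paper's results.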


Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Brain Tumor Detection and Classification