Towards a deeper GCN: Alleviate over-smoothing with iterative training and fine-tuning
Furong Peng, Jinzhen Gao, Xuan Lu, Kang Liu, Yifan Huo, Sheng Wang

TL;DR
This paper argues that over-smoothing in deep GCNs is caused partly by the trainable linear transformations, not only by repeated graph propagation, and proposes Layer-wise Gradual Training (LGT), a training strategy that allows very deep GCNs to be trained with improved performance and stability.
Contribution
It introduces LGT, a training strategy that combines layer-wise optimization, low-rank fine-tuning, and identity initialization to stabilize the training of deep GCNs and mitigate over-smoothing.
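The three ingredients named above can be pictured with a small structural sketch. This is not the authors' implementation: the summary does not say how the pieces interact, so the sketch assumes that the network grows one layer per stage, that each new layer starts as an identity matrix, and that layers from earlier stages are afterwards adjusted only through a low-rank update `a @ b`. The helper names (`new_adapter`, `effective_weight`, `grow`) and the rank and width values are hypothetical, and the training step itself is omitted.

```python
import numpy as np

def new_adapter(dim, rank, rng):
    """Hypothetical low-rank update (a, b); b is zero so the update starts at zero."""
    a = rng.normal(scale=0.01, size=(dim, rank))
    b = np.zeros((rank, dim))
    return a, b

def effective_weight(base, adapter):
    """Weight used in the forward pass: base matrix plus optional low-rank update."""
    if adapter is None:
        return base
    a, b = adapter
    return base + a @ b

def grow(bases, adapters, dim, rank, rng):
    """One assumed growth stage.

    The layer that was trained in full during the previous stage receives a
    low-rank adapter, and a new identity-initialized layer is appended.
    """
    adapters = [new_adapter(dim, rank, rng) if ad is None else ad for ad in adapters]
    return bases + [np.eye(dim)], adapters + [None]

rng = np.random.default_rng(0)
bases, adapters = [], []
for _ in range(3):
    bases, adapters = grow(bases, adapters, dim=4, rank=2, rng=rng)
    # A real stage would train here: the newest base matrix in full,
    # and only the (a, b) pairs of the earlier layers.

weights = [effective_weight(w, ad) for w, ad in zip(bases, adapters)]
```

Because no training happens in this sketch, every effective weight is still the identity. Under these assumptions, a freshly added layer initially behaves like a plain propagation step with no linear transformation.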
Findings
LGT achieves state-of-the-art accuracy for GCNs up to 32 layers deep.
LGT improves performance when combined with existing normalization methods.
Deep GCNs trained with LGT outperform the same models trained with traditional methods.
Abstract
Graph Convolutional Networks (GCNs) suffer from severe performance degradation in deep architectures due to over-smoothing. While existing studies primarily attribute the over-smoothing to repeated applications of graph Laplacian operators, our empirical analysis reveals a critical yet overlooked factor: trainable linear transformations in GCNs significantly exacerbate feature collapse, even at moderate depths (e.g., 8 layers). In contrast, Simplified Graph Convolution (SGC), which removes these transformations, maintains stable feature diversity up to 32 layers, highlighting linear transformations' dual role in facilitating expressive power and inducing over-smoothing. However, completely removing linear transformations weakens the model's expressive capacity. To address this trade-off, we propose Layer-wise Gradual Training (LGT), a novel training strategy that progressively builds…
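The contrast the abstract draws between GCN and SGC can be made concrete with a minimal NumPy sketch. It assumes the standard symmetric normalization with self-loops and a ReLU after each GCN layer. The function names, the toy ring graph, and the mean pairwise distance used as a feature-diversity proxy are illustrative choices, not taken from the paper.

```python
import numpy as np

def normalized_adjacency(adj):
    """Symmetric normalization with self-loops: D^{-1/2} (A + I) D^{-1/2}."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def sgc_propagate(s, x, depth):
    """SGC-style: repeated propagation with no per-layer weights."""
    h = x
    for _ in range(depth):
        h = s @ h
    return h

def gcn_propagate(s, x, weights):
    """GCN-style: propagation, linear transformation, then ReLU at every layer."""
    h = x
    for w in weights:
        h = np.maximum(s @ h @ w, 0.0)
    return h

def mean_pairwise_distance(h):
    """Illustrative diversity proxy: mean Euclidean distance between node rows."""
    diff = h[:, None, :] - h[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    n = h.shape[0]
    return dist.sum() / (n * (n - 1))

# Toy example: a 6-node ring graph with non-negative random features.
rng = np.random.default_rng(0)
n, dim, depth = 6, 4, 3
adj = np.zeros((n, n))
for i in range(n):
    adj[i, (i + 1) % n] = adj[(i + 1) % n, i] = 1.0

s = normalized_adjacency(adj)
x = np.abs(rng.normal(size=(n, dim)))
h_sgc = sgc_propagate(s, x, depth)
h_gcn = gcn_propagate(s, x, [np.eye(dim)] * depth)
print(mean_pairwise_distance(x), mean_pairwise_distance(h_sgc))
```

With identity weights and non-negative inputs, the GCN-style stack reduces exactly to the SGC-style propagation, since the ReLU never clips. The sketch makes no claim about how trained weights behave; measuring that would require fitting the weight matrices on a real dataset.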
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Brain Tumor Detection and Classification
