SigGate-GT: Taming Over-Smoothing in Graph Transformers via Sigmoid-Gated Attention

Dongxin Guo; Jikun Wu; Siu Ming Yiu

arXiv:2604.17324·cs.LG·April 21, 2026

TL;DR

SigGate-GT adds sigmoid gating to graph transformers to mitigate over-smoothing and attention entropy degeneration, improving performance and training stability on molecular benchmarks.

Contribution

The paper proposes a sigmoid gating mechanism for graph transformers: learned, per-head gates on the attention output let heads selectively silence uninformative attention, addressing over-smoothing and attention entropy degeneration.

Findings

01. Achieves state-of-the-art on ogbg-molhiv (82.47% ROC-AUC).

02. Reduces over-smoothing by 30% across layers.

03. Increases attention entropy and stabilizes training.
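
The findings above refer to over-smoothing and attention entropy without saying how either is measured. As a rough illustration only, the sketch below computes the Shannon entropy of a single attention row and uses mean pairwise cosine similarity between node embeddings as one common proxy for over-smoothing. Both function names and the choice of cosine similarity are assumptions made for this example; the paper's own metrics may differ.

```python
import math


def attention_entropy(row):
    """Shannon entropy (in nats) of one attention row that sums to 1."""
    return -sum(p * math.log(p) for p in row if p > 0.0)


def mean_pairwise_cosine(H):
    """Average cosine similarity over all distinct pairs of node embeddings.

    Assumed proxy for over-smoothing: values near 1.0 suggest the node
    representations have collapsed toward a single direction.
    Assumes at least two nodes and no all-zero embedding.
    """
    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm_a = math.sqrt(sum(x * x for x in a))
        norm_b = math.sqrt(sum(y * y for y in b))
        return dot / (norm_a * norm_b)

    pairs = [(i, j) for i in range(len(H)) for j in range(i + 1, len(H))]
    return sum(cosine(H[i], H[j]) for i, j in pairs) / len(pairs)
```

A uniform row over n nodes has entropy log(n), and a one-hot row has entropy 0, so a head whose attention collapses onto a single node scores low on the first measure. Tracking the second measure layer by layer gives a simple picture of how quickly node representations converge.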

Abstract

Graph transformers achieve strong results on molecular and long-range reasoning tasks, yet remain hampered by over-smoothing (the progressive collapse of node representations with depth) and attention entropy degeneration. We observe that these pathologies share a root cause with attention sinks in large language models: softmax attention's sum-to-one constraint forces every node to attend somewhere, even when no informative signal exists. Motivated by recent findings that element-wise sigmoid gating eliminates attention sinks in large language models, we propose SigGate-GT, a graph transformer that applies learned, per-head sigmoid gates to the attention output within the GraphGPS framework. Each gate can suppress activations toward zero, enabling heads to selectively silence uninformative connections. On five standard benchmarks, SigGate-GT matches the prior best on ZINC (0.059 MAE)…
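
The mechanism described in the abstract can be sketched for a single attention head as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the gate is assumed to be a linear map of each node's input features plus a bias (`Wg` and `bg` are hypothetical names), attention is taken over all nodes with no adjacency mask, and the local message-passing branch of GraphGPS is omitted.

```python
import math


def softmax(row):
    m = max(row)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def matvec(W, x):
    # W is a list of rows; returns the product W @ x as a list.
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def gated_head(X, Wq, Wk, Wv, Wg, bg):
    """One softmax attention head followed by an element-wise sigmoid gate.

    Wg and bg are hypothetical gate parameters (assumption: the gate is
    computed from the node's own input features).
    """
    Q = [matvec(Wq, x) for x in X]
    K = [matvec(Wk, x) for x in X]
    V = [matvec(Wv, x) for x in X]
    d = len(Q[0])
    out = []
    for i, x in enumerate(X):
        scores = [sum(q * k for q, k in zip(Q[i], kj)) / math.sqrt(d) for kj in K]
        attn = softmax(scores)  # sums to one: the node must attend somewhere
        mixed = [sum(a * v[c] for a, v in zip(attn, V)) for c in range(len(V[0]))]
        gate = [sigmoid(g + b) for g, b in zip(matvec(Wg, x), bg)]
        out.append([g * m for g, m in zip(gate, mixed)])  # gate can push output toward zero
    return out


# Toy inputs: three nodes with two features each, identity projections.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
eye = [[1.0, 0.0], [0.0, 1.0]]
zeros = [[0.0, 0.0], [0.0, 0.0]]
open_out = gated_head(X, eye, eye, eye, zeros, [20.0, 20.0])     # gate close to 1
closed_out = gated_head(X, eye, eye, eye, zeros, [-20.0, -20.0])  # gate close to 0
```

With a strongly negative gate bias the head's output is driven toward zero even though its softmax weights still sum to one, which is the sense in which a gate lets a head stay silent when no informative signal exists. With a strongly positive bias the head behaves like ordinary softmax attention.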
