Distance-Misaligned Training in Graph Transformers and Adaptive Graph-Aware Control
Qinhan Hou, Jing Tang

TL;DR
This paper studies how the graph-distance bias of a Graph Transformer's communication affects performance on tasks of varying locality, and proposes adaptive controllers that align that bias with where label-relevant information lies.
Contribution
It introduces a synthetic benchmark and adaptive control strategies to align Graph Transformer communication with task-specific locality requirements.
Findings
An oracle adaptive controller nearly matches the best fixed bias across regimes and strongly improves over a neutral baseline on mixed and local tasks.
Preferred graph-distance bias varies systematically with task locality.
Task-agnostic controllers are less effective than task-aware ones.
Abstract
Graph Transformers can mix information globally, but this flexibility also creates failure modes: some tasks require long-range communication while others are better served by local interaction. We study this through a synthetic node-classification benchmark on contextual stochastic block model graphs, where labels are generated by a controllable mixture of local and far-shell signals. We define distance-misaligned training as a mismatch between where label-relevant information lies and where the model allocates communication over graph distance. On this benchmark, we report three findings. First, the preferred graph-distance bias changes systematically with task locality. Second, an oracle adaptive controller, given offline access to the task-side distance target, nearly matches the best fixed bias across regimes and strongly improves over a neutral baseline on mixed and local tasks.…
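To make the idea of distance misalignment concrete, the sketch below shows one possible way to bias attention by graph distance and to measure how attention mass is spread over hop distance. It is an illustration under stated assumptions, not the authors' implementation: the additive bias form (logits minus beta times hop distance), the use of total variation as the mismatch measure, and all function names are hypothetical choices made here.

```python
import numpy as np
from collections import deque


def shortest_path_distances(adj):
    """All-pairs hop distances via BFS; unreachable pairs are marked -1."""
    n = len(adj)
    dist = np.full((n, n), -1, dtype=int)
    for s in range(n):
        dist[s, s] = 0
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in np.flatnonzero(adj[u]):
                if dist[s, v] < 0:
                    dist[s, v] = dist[s, u] + 1
                    queue.append(v)
    return dist


def distance_biased_attention(logits, dist, beta):
    """Row-wise softmax of (logits - beta * distance).

    Assumed bias form: beta > 0 favors nearby nodes, beta < 0 favors
    far nodes, beta = 0 is neutral. Unreachable pairs get zero weight.
    """
    scores = np.where(dist >= 0, logits - beta * dist, -np.inf)
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    return weights / weights.sum(axis=1, keepdims=True)


def attention_mass_by_distance(attn, dist, max_dist):
    """Fraction of total attention mass at each hop distance 0..max_dist."""
    mass = np.array([attn[dist == d].sum() for d in range(max_dist + 1)])
    return mass / mass.sum()


def misalignment(model_mass, target_mass):
    """Total variation distance between two distributions over hop distance
    (an assumed stand-in for the paper's notion of mismatch)."""
    return 0.5 * np.abs(model_mass - target_mass).sum()


# Usage on a 5-node path graph with uninformative (all-zero) logits.
adj = np.zeros((5, 5), dtype=int)
for i in range(4):
    adj[i, i + 1] = adj[i + 1, i] = 1
dist = shortest_path_distances(adj)
logits = np.zeros((5, 5))

far_target = np.array([0.0, 0.0, 0.0, 0.5, 0.5])  # hypothetical far-shell task
for beta in (2.0, 0.0, -2.0):
    attn = distance_biased_attention(logits, dist, beta)
    mass = attention_mass_by_distance(attn, dist, max_dist=4)
    print(beta, misalignment(mass, far_target))
```

In this toy setting, a far-shell target is matched better by a negative beta than by a positive one, which mirrors the qualitative claim that the preferred bias depends on task locality. An adaptive controller in this framing would adjust beta to reduce the mismatch; how the paper's controllers actually do this is not specified in the text above.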
