Hierarchical Graph Transformer with Adaptive Node Sampling

Zaixi Zhang; Qi Liu; Qingyong Hu; Chee-Kong Lee

arXiv:2210.03930 · cs.LG · October 11, 2022 · 30 citations

TL;DR

This paper introduces a hierarchical graph transformer with adaptive node sampling that improves performance on large graphs by capturing long-range dependencies and optimizing sampling strategies through an adversary bandit formulation.

Contribution

It proposes a novel hierarchical attention scheme with graph coarsening and adaptive node sampling formulated as an adversary bandit problem, enhancing graph transformer performance.

Findings

01 Outperforms existing graph transformers on real-world datasets

02 Effectively captures long-range dependencies in large graphs

03 Reduces computational complexity with hierarchical attention
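
The idea behind the third finding can be sketched roughly as follows. This is an illustrative toy, not the authors' implementation: the helper names (`mean_pool`, `hierarchical_tokens`, `attend`), the mean-pooling coarsening, and the use of raw features as queries, keys, and values (no learned projections) are all assumptions made for the example. It shows a center node attending over a few sampled nodes plus a handful of coarsened "super-node" tokens instead of every node in the graph.

```python
import math

def mean_pool(features, assignment, n_clusters):
    """Coarsen node features by averaging within each cluster (assumed scheme)."""
    dim = len(features[0])
    sums = [[0.0] * dim for _ in range(n_clusters)]
    counts = [0] * n_clusters
    for feat, cluster in zip(features, assignment):
        counts[cluster] += 1
        for j, value in enumerate(feat):
            sums[cluster][j] += value
    return [[s / max(cnt, 1) for s in row] for row, cnt in zip(sums, counts)]

def hierarchical_tokens(features, sampled_ids, assignment, n_clusters):
    """Token sequence = sampled fine-grained nodes followed by coarse super-nodes."""
    fine = [features[i] for i in sampled_ids]
    coarse = mean_pool(features, assignment, n_clusters)
    return fine + coarse

def attend(query, tokens):
    """Single-head scaled dot-product attention with identity projections."""
    dim = len(query)
    scores = [sum(q * t for q, t in zip(query, tok)) / math.sqrt(dim) for tok in tokens]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    output = [sum(w * tok[j] for w, tok in zip(weights, tokens)) for j in range(dim)]
    return weights, output

# Toy graph: 6 nodes with 2-d features, split into 2 hypothetical clusters.
features = [[1.0, 0.0], [0.9, 0.1], [0.8, 0.2], [0.0, 1.0], [0.1, 0.9], [0.2, 0.8]]
assignment = [0, 0, 0, 1, 1, 1]
tokens = hierarchical_tokens(features, sampled_ids=[1, 2], assignment=assignment, n_clusters=2)
weights, output = attend(features[0], tokens)
```

In this toy setup, node 0 attends over 4 tokens (2 sampled nodes and 2 super-nodes) rather than all 6 nodes; the super-node for the second cluster still gives it a summary of the distant part of the graph.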

Abstract

The Transformer architecture has achieved remarkable success in a number of domains including natural language processing and computer vision. However, when it comes to graph-structured data, transformers have not achieved competitive performance, especially on large graphs. In this paper, we identify the main deficiencies of current graph transformers: (1) Existing node sampling strategies in Graph Transformers are agnostic to the graph characteristics and the training process. (2) Most sampling strategies only focus on local neighbors and neglect the long-range dependencies in the graph. We conduct experimental investigations on synthetic datasets to show that existing sampling strategies are sub-optimal. To tackle the aforementioned problems, we formulate the optimization strategies of node sampling in Graph Transformer as an adversary bandit problem, where the rewards are related to…
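
For readers unfamiliar with the bandit framing (usually called the adversarial bandit setting), the sketch below shows the generic EXP3 algorithm choosing among several node sampling strategies. It is not taken from the paper: the abstract above is truncated before it defines the reward, so `toy_reward`, the number of strategies, and the `gamma` value are placeholders assumed purely for illustration, and the paper's actual update rule may differ.

```python
import math
import random

def exp3_probs(weights, gamma):
    """Mix the normalized weights with uniform exploration (standard EXP3)."""
    total = sum(weights)
    k = len(weights)
    return [(1.0 - gamma) * w / total + gamma / k for w in weights]

def exp3_update(weights, probs, arm, reward, gamma):
    """Update only the chosen arm, using an importance-weighted reward estimate."""
    k = len(weights)
    estimated = reward / probs[arm]
    updated = list(weights)
    updated[arm] = weights[arm] * math.exp(gamma * estimated / k)
    return updated

def run_exp3(reward_fn, n_arms, rounds, gamma=0.1, seed=0):
    """Run EXP3 and return the final probability of picking each arm."""
    rng = random.Random(seed)
    weights = [1.0] * n_arms
    for _ in range(rounds):
        probs = exp3_probs(weights, gamma)
        arm = rng.choices(range(n_arms), weights=probs)[0]
        reward = reward_fn(arm)  # must lie in [0, 1]
        weights = exp3_update(weights, probs, arm, reward, gamma)
        largest = max(weights)  # rescale to avoid overflow; probabilities are unchanged
        weights = [w / largest for w in weights]
    return exp3_probs(weights, gamma)

def toy_reward(arm):
    """Placeholder reward: pretend sampling strategy 2 helps training the most."""
    return 0.8 if arm == 2 else 0.2

# Four hypothetical sampling strategies, indexed 0..3.
final_probs = run_exp3(toy_reward, n_arms=4, rounds=2000, gamma=0.1, seed=0)
```

Under these assumed rewards the probability mass shifts toward strategy 2 over time, while the `gamma / k` term keeps every strategy sampled occasionally, which is what lets the chooser adapt if the rewards change during training.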

Code & Models

Repositories

zaixizhang/ans-gt (PyTorch, official)

Videos

Hierarchical Graph Transformer with Adaptive Node Sampling · SlidesLive

Taxonomy

Topics: Advanced Graph Neural Networks · Online Learning and Analytics

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Laplacian EigenMap · Position-Wise Feed-Forward Layer · Residual Connection · Dropout · Laplacian Positional Encodings · Softmax · Label Smoothing