Delayed Bottlenecking: Alleviating Forgetting in Pre-trained Graph Neural Networks

Zhe Zhao; Pengkun Wang; Xu Wang; Haibin Wen; Xiaolong Xie; Zhengyang Zhou; Qingfu Zhang; Yang Wang

arXiv:2404.14941·cs.LG·April 24, 2024

Open Access

TL;DR

This paper introduces a novel pre-training framework for graph neural networks that delays information compression until fine-tuning, aiming to reduce forgetting and improve transferability of learned representations.

Contribution

The proposed Delayed Bottlenecking Pre-training (DBP) framework maintains mutual information during pre-training and delays compression to fine-tuning, addressing information loss issues in traditional GNN pre-training.
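For orientation, the standard Information Bottleneck objective that this summary alludes to can be written as below. This is the generic textbook form, not the paper's actual loss: the symbols $X$ (input graph), $Z$ (learned representation), $Y$ (task label), and the trade-off weight $\beta$ are assumptions introduced here for illustration. Read schematically, "delaying" the bottleneck means the compression term $\beta\, I(X;Z)$ is not enforced during pre-training and is applied only at fine-tuning.

```latex
% Generic Information Bottleneck objective (illustrative notation, not taken from the paper):
% keep Z informative about the label Y while compressing away information about the input X.
\max_{Z} \; I(Z;Y) \;-\; \beta \, I(X;Z), \qquad \beta > 0
```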

Findings

1. DBP improves transfer performance on chemistry and biology datasets.

2. Delaying compression reduces forgetting in pre-trained GNNs.

3. The framework effectively balances information retention and task-specific adaptation.

Abstract

Pre-training GNNs to extract transferable knowledge and apply it to downstream tasks has become the de facto standard in graph representation learning. Recent works have focused on designing self-supervised pre-training tasks to extract useful and universal transferable knowledge from large-scale unlabeled data. However, they face an unavoidable problem: traditional pre-training strategies, which aim to extract information useful for the pre-training tasks, may not extract all the information useful for the downstream task. In this paper, we reexamine the pre-training process within the traditional pre-training and fine-tuning framework from the perspective of the Information Bottleneck (IB) and confirm that the forgetting phenomenon in the pre-training phase may have detrimental effects on downstream tasks. Therefore, we propose a novel Delayed Bottlenecking…

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · Multimodal Machine Learning Applications