On the Equivalence of Graph Convolution and Mixup

Xiaotian Han; Hanqing Zeng; Yu Chen; Shaoliang Nie; Jingzhou Liu; Kanika Narang; Zahra Shakeri; Karthik Abinav Sankararaman; Song Jiang; Madian Khabsa; Qifan Wang; Xia Hu

arXiv:2310.00183 · cs.LG · September 13, 2024

Open Access · 1 Repo · 3 Reviews

TL;DR

This paper demonstrates that under certain conditions, graph convolution can be mathematically and practically viewed as a form of Mixup data augmentation, unifying two seemingly different techniques in graph neural networks.

Contribution

It establishes a theoretical and empirical connection between graph convolution and Mixup, showing they are equivalent under specific conditions.

Findings

01

Graph convolution can be expressed as a form of Mixup.

02

Under two mild conditions, GCN and SGC are equivalent to Mixup.

03

Empirical results confirm the theoretical equivalence.

Abstract

This paper investigates the relationship between graph convolution and Mixup techniques. Graph convolution in a graph neural network involves aggregating features from neighboring samples to learn representative features for a specific node or sample. On the other hand, Mixup is a data augmentation technique that generates new examples by averaging features and one-hot labels from multiple samples. One commonality between these techniques is their utilization of information from multiple samples to derive feature representation. This study aims to explore whether a connection exists between these two approaches. Our investigation reveals that, under two mild conditions, graph convolution can be viewed as a specialized form of Mixup that is applied during both the training and testing phases. The two conditions are: 1) Homophily Relabel - assigning the target node's label to all…
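
The correspondence described in the abstract can be illustrated numerically. The following is a minimal sketch, not the authors' implementation: it assumes mean aggregation over a node's neighbors plus a self-loop (row-normalized adjacency) rather than any particular GCN normalization, and it assumes the truncated "Homophily Relabel" sentence refers to the target node's neighbors. The toy graph, features, labels, and the helper names `mean_aggregate` and `mixup` are hypothetical.

```python
import numpy as np

def mean_aggregate(adj, feats, node):
    """One-hop mean aggregation for `node` (assumed: self-loop + row normalization)."""
    a_hat = adj + np.eye(adj.shape[0])                 # add self-loops
    a_norm = a_hat / a_hat.sum(axis=1, keepdims=True)  # each row sums to 1
    return a_norm[node] @ feats, a_norm[node]

def mixup(feats, labels, weights):
    """Mixup as a convex combination of features and one-hot labels."""
    return weights @ feats, weights @ labels

# Hypothetical toy graph: 4 nodes, undirected edges 0-1, 0-2, 2-3.
adj = np.array([[0, 1, 1, 0],
                [1, 0, 0, 0],
                [1, 0, 0, 1],
                [0, 0, 1, 0]], dtype=float)
feats = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 0.0]])
labels = np.eye(2)[[0, 1, 0, 1]]                       # one-hot labels

target = 0
conv_out, weights = mean_aggregate(adj, feats, target)

# Features: the aggregation row acts as the Mixup coefficients.
mixed_x, mixed_y = mixup(feats, labels, weights)

# Homophily Relabel (assumed reading): every mixed sample carries the target's label,
# so the mixed label collapses back to the target's one-hot label.
relabeled = np.tile(labels[target], (len(labels), 1))
_, relabeled_y = mixup(feats, relabeled, weights)

print("graph conv output:", conv_out)
print("mixup features:   ", mixed_x)
print("mixup label (original labels):", mixed_y)
print("mixup label (relabeled):      ", relabeled_y)
```

In this sketch the aggregated feature and the Mixup feature coincide because the normalized adjacency row is a set of non-negative weights summing to one. Only the label side differs until the neighbors are relabeled.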

Peer Reviews

Decision · ICLR 2024 Conference Withdrawn Submission

Reviewer 01 · Rating 3 (reject, not good enough) · Confidence 4

Strengths

- **Novel and original.** The paper proposes a very interesting interpretation of graph convolution, which to the best of my knowledge, has not been studied yet.
- **Great presentation.** Paper is well-written with clear mathematical derivations that are easy to follow.

Weaknesses

- **Overall significance is unclear and claimed practical/theoretical implications lack support.**
  - [W1] While Section 6 mentions how TMLP and HMLP can be training-efficient with large-scale graphs since they are MLP-based, there are no experiments to compare computational costs and demonstrate this claim. I also suspect the gain in efficiency would be fairly small since graph convolution can be highly optimized via sparse-dense matrix multiplications.
  - [W2] The theoretical implications al

Reviewer 02 · Rating 5 (marginally below the acceptance threshold) · Confidence 3

Strengths

By examining the equations of both MixUp and GCN, the authors derive the link between the two. In particular, a relabeling of the neighboring nodes allows GCNs to be expressed as DNNs + MixUp. I believe this is a pretty interesting conclusion! As mentioned, this has the potential to speed up computation / scale of GCNs by dropping (expensive) sparse primitives and replacing them with efficient dense operations. The paper is very extensive and well written, with lots of good plots and examples!

Weaknesses

I may have missed something, but I believe there is a loss when going from GCNs to DNN+MixUp: a loss in structural generalization? (see questions)

Reviewer 03 · Rating 3 (reject, not good enough) · Confidence 3

Strengths

The paper is investigating an interesting relationship between graph convolution and Mixup and has several strengths:
1. The paper is very detailed and convincing about the similarities of GNNs and Mixup.
2. The presentation quality is good.
3. The proposed methods are potentially valuable (for example, can be fast to train and evaluate for large graphs).

Weaknesses

While the paper is convincing about the equivalence of graph convolution and Mixup, its contributions are quite limited.
1. The paper claims potential speedups (e.g. "this work ... facilitating efficient training and inference for GNNs when dealing with large-scale graph data"), but does not provide any time/memory cost comparisons for training/inference. Therefore, it's hard to be convinced about the benefit of the proposed approaches. For example, see Table 1 in [Huang2021] and results in [

Code & Models

Repositories

ahxt/graphconv_is_mixup
PyTorch · Official

Taxonomy

Topics: Advanced Graph Neural Networks · Complex Network Analysis Techniques · Brain Tumor Detection and Classification

Methods: Graph Neural Network · Convolution · Mixup