NAR-Former V2: Rethinking Transformer for Universal Neural Network Representation Learning

Yun Yi; Haokui Zhang; Rong Xiao; Nannan Wang; Xiaoyu Wang

arXiv:2306.10792 · cs.LG · October 17, 2023 · 1 citation

TL;DR

This paper introduces NAR-Former V2, a Transformer-based model that learns representations of both cell-structured and entire neural networks. It outperforms the GNN-based NNLP in latency estimation on the NNLQP dataset and achieves accuracy-prediction results comparable to the state of the art on NASBench101 and NASBench201.

Contribution

The paper proposes a novel Transformer-based model that incorporates GNN capabilities for universal neural network representation learning, improving generalization and performance over existing methods.

Findings

01. Surpasses the GNN-based NNLP in latency estimation on the NNLQP dataset.

02. Achieves accuracy-prediction results comparable to state-of-the-art methods on the NASBench101 and NASBench201 datasets.

03. Enhances the Transformer with graph encoding and inductive learning for better generalization.

Abstract

As more deep learning models are being applied in real-world applications, there is a growing need for modeling and learning the representations of neural networks themselves. An efficient representation can be used to predict target attributes of networks without the need for actual training and deployment procedures, facilitating efficient network deployment and design. Recently, inspired by the success of Transformer, some Transformer-based representation learning frameworks have been proposed and achieved promising performance in handling cell-structured models. However, graph neural network (GNN) based approaches still dominate the field of learning representation for the entire network. In this paper, we revisit Transformer and compare it with GNN to analyse their different architecture characteristics. We then propose a modified Transformer-based universal neural network…

Peer Reviews

No public reviews on file for this paper yet.

Code & Models

Repositories

yuny220/NAR-Former-V2 (PyTorch, official)

Videos

NAR-Former V2: Rethinking Transformer for Universal Neural Network Representation Learning · SlidesLive

Taxonomy

Topics: Advanced Graph Neural Networks · Brain Tumor Detection and Classification · Machine Learning and ELM

Methods: Multi-Head Attention · Attention Is All You Need · Dense Connections · Dropout · Byte Pair Encoding · Softmax · Layer Normalization · Position-Wise Feed-Forward Layer · Linear Layer · Absolute Position Encodings