Neural Functional Transformers

Allan Zhou; Kaien Yang; Yiding Jiang; Kaylee Burns; Winnie Xu; Samuel Sokota; J. Zico Kolter; Chelsea Finn

arXiv:2305.13546 · cs.LG · May 24, 2023 · 2 citations

Open Access

TL;DR

Neural functional Transformers (NFTs) use attention to process neural network weights, matching or exceeding prior weight-space methods and improving implicit neural representation (INR) classification accuracy by up to 17%.

Contribution

The paper introduces NFTs, a deep permutation-equivariant architecture composed of attention-based weight-space layers, and a new INR representation method called Inr2Array.

Findings

01. NFTs match or outperform prior weight-space methods.
02. Inr2Array improves INR classification accuracy by up to 17%.
03. NFTs effectively handle high-dimensional weight objects.

Abstract

The recent success of neural networks as implicit representations of data has driven growing interest in neural functionals: models that can process other neural networks as input by operating directly over their weight spaces. Nevertheless, constructing expressive and efficient neural functional architectures that can handle high-dimensional weight-space objects remains challenging. This paper uses the attention mechanism to define a novel set of permutation-equivariant weight-space layers and composes them into deep equivariant models called neural functional Transformers (NFTs). NFTs respect weight-space permutation symmetries while incorporating the advantages of attention, which have exhibited remarkable success across multiple domains. In experiments processing the weights of feedforward MLPs and CNNs, we find that NFTs match or exceed the performance of prior weight-space methods.…
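
The two ideas the abstract relies on, weight-space permutation symmetry and the permutation equivariance of attention, can be illustrated with a small sketch. This is not the paper's NFT layer: it assumes a bias-free two-layer tanh MLP and a single-head self-attention with identity query/key/value maps and no positional encoding, and every name in it (`mlp`, `permute_hidden`, `self_attention`, `close`) is hypothetical. It only checks that reordering hidden neurons leaves the network's function unchanged, and that attention applied to the per-neuron weight rows reorders its outputs in the same way.

```python
import math
import random


def mlp(x, w1, w2):
    """Bias-free two-layer MLP: tanh hidden layer, linear output.

    w1 has one row per hidden neuron; w2 has one row per output unit.
    """
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in w1]
    return [sum(w * h for w, h in zip(row, hidden)) for row in w2]


def permute_hidden(w1, w2, perm):
    """Reorder hidden neurons: rows of w1 and the matching columns of w2."""
    w1_p = [w1[p] for p in perm]
    w2_p = [[row[p] for p in perm] for row in w2]
    return w1_p, w2_p


def self_attention(rows):
    """Single-head dot-product self-attention over a set of row vectors.

    Assumption for illustration: identity Q/K/V maps, no positional encoding.
    """
    d = len(rows[0])
    out = []
    for q in rows:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in rows]
        top = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        out.append(
            [sum(w * v[j] for w, v in zip(weights, rows)) / total for j in range(d)]
        )
    return out


def close(a, b, tol=1e-9):
    """Elementwise comparison of two vectors up to a small tolerance."""
    return all(abs(x - y) <= tol for x, y in zip(a, b))


rng = random.Random(0)
n_in, n_hidden, n_out = 3, 4, 2
w1 = [[rng.gauss(0, 1) for _ in range(n_in)] for _ in range(n_hidden)]
w2 = [[rng.gauss(0, 1) for _ in range(n_hidden)] for _ in range(n_out)]
x = [rng.gauss(0, 1) for _ in range(n_in)]
perm = [2, 0, 3, 1]

# Symmetry: permuting hidden neurons does not change the function computed.
w1_p, w2_p = permute_hidden(w1, w2, perm)
same_function = close(mlp(x, w1, w2), mlp(x, w1_p, w2_p))

# Equivariance: treating each hidden neuron's incoming weights as a token,
# attention on the permuted rows equals the permuted attention output.
attn = self_attention(w1)
attn_p = self_attention(w1_p)
equivariant = all(close(attn_p[i], attn[p]) for i, p in enumerate(perm))

print(same_function, equivariant)
```

Because the attention weights depend only on the contents of the rows and not on their order, a layer built this way cannot distinguish between equivalent neuron orderings, which is the property a weight-space model needs in order to respect the symmetry checked in the first half of the sketch.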

Taxonomy

Topics: Advanced Neural Network Applications · Machine Learning in Materials Science · Topic Modeling