QuadraNet V2: Efficient and Sustainable Training of High-Order Neural Networks with Quadratic Adaptation

Chenhui Xu; Xinyao Wang; Fuxun Yu; Jinjun Xiong; Xiang Chen

arXiv:2405.03192 · cs.LG · May 10, 2024

TL;DR

QuadraNet V2 introduces a quadratic neural network framework that efficiently leverages pre-trained weights to significantly reduce training time while enhancing the modeling of data non-linearity and shifts.

Contribution

The paper presents a quadratic neural network architecture that combines pre-trained primary terms with quadratic adaptation, improving training efficiency and modeling capacity.

Findings

01 Reduces GPU hours by up to 98.4% compared to training from scratch.

02 Enhances high-order model capacity with quadratic adaptation.

03 Demonstrates effective transfer learning with pre-trained weights.

Abstract

Machine learning is evolving towards high-order models that necessitate pre-training on extensive datasets, a process associated with significant overheads. Traditional models, despite having pre-trained weights, are becoming obsolete due to architectural differences that obstruct the effective transfer and initialization of these weights. To address these challenges, we introduce a novel framework, QuadraNet V2, which leverages quadratic neural networks to create efficient and sustainable high-order learning models. Our method initializes the primary term of the quadratic neuron using a standard neural network, while the quadratic term is employed to adaptively enhance the learning of data non-linearity or shifts. This integration of pre-trained primary terms with quadratic terms, which possess advanced modeling capabilities, significantly augments the information characterization…
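The initialization idea in the abstract can be illustrated with a minimal sketch. It assumes a single quadratic neuron of the form y = (w_a · x)(w_b · x) + w_c · x + b, where w_c and b are the primary term copied from a pre-trained linear neuron. The class name `QuadraticNeuron`, the attribute names, and the choice to start the quadratic weights at zero are assumptions made for illustration; the abstract does not specify the exact neuron form or initialization.

```python
from typing import List, Sequence


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


class QuadraticNeuron:
    """Hypothetical quadratic neuron: (w_a . x) * (w_b . x) + w_c . x + b."""

    def __init__(self, pretrained_weights: Sequence[float], pretrained_bias: float):
        # Primary (linear) term: copied from a pre-trained standard neuron.
        self.w_c: List[float] = list(pretrained_weights)
        self.b: float = pretrained_bias
        # Quadratic term: zero-initialized here (an assumption), so the neuron
        # starts out computing exactly what the pre-trained neuron computed.
        self.w_a: List[float] = [0.0] * len(self.w_c)
        self.w_b: List[float] = [0.0] * len(self.w_c)

    def primary(self, x: Sequence[float]) -> float:
        """Output of the pre-trained linear part alone."""
        return dot(self.w_c, x) + self.b

    def quadratic(self, x: Sequence[float]) -> float:
        """Second-order interaction term added on top of the primary term."""
        return dot(self.w_a, x) * dot(self.w_b, x)

    def forward(self, x: Sequence[float]) -> float:
        return self.primary(x) + self.quadratic(x)


# Usage: wrap a (hypothetical) pre-trained neuron, then adapt the quadratic part.
neuron = QuadraticNeuron(pretrained_weights=[0.5, -1.0], pretrained_bias=0.25)
x = [2.0, 1.0]
before_adaptation = neuron.forward(x)  # equals the pre-trained output

neuron.w_a = [1.0, 0.0]  # stand-in for weights learned during adaptation
neuron.w_b = [0.0, 1.0]
after_adaptation = neuron.forward(x)   # adds the interaction x[0] * x[1]
```

With the quadratic weights at zero, the wrapped neuron reproduces the pre-trained output, so adaptation starts from the behavior of the original model; once the quadratic weights become non-zero, the neuron can capture second-order interactions between inputs that the linear term cannot express.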

Taxonomy

Topics: Neural Networks and Applications