An alternative approach to train neural networks using monotone variational inequality

Chen Xu; Xiuyuan Cheng; Yao Xie

arXiv:2202.08876·stat.ML·March 13, 2024

Open Access · 1 Repo

TL;DR

This paper introduces a neural network training method based on monotone variational inequalities. The resulting procedures are computationally efficient and come with guarantees in special cases, such as training a single-layer network or fine-tuning the last layer of a pre-trained model.

Contribution

It develops a new approach using monotone vector fields for neural network training, extending previous work to practical deep learning scenarios.

Findings

01 Efficient convergence in training neural networks.

02 Competitive or superior performance compared to SGD.

03 Applicable to various neural network architectures.

Abstract

We propose an alternative approach to neural network training using the monotone vector field. The idea is inspired by the seminal work of Juditsky and Nemirovski [Juditsky & Nemirovski, 2019], originally developed to solve parameter estimation problems for generalized linear models (GLMs) by reducing the original non-convex problem to a convex problem of solving a monotone variational inequality (VI). Our approach leads to computationally efficient procedures that converge fast and offer guarantees in some special cases, such as training a single-layer neural network or fine-tuning the last layer of a pre-trained model. Our approach can be used for more efficient fine-tuning of a pre-trained model while freezing the bottom layers, an essential step for deploying many machine learning models such as large language models (LLMs). We demonstrate its applicability in training fully-connected…
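To make the GLM reduction mentioned in the abstract concrete, here is a minimal NumPy sketch of the single-layer case. It assumes a sigmoid link and uses the empirical vector field F(θ) = (1/n) Xᵀ(σ(Xθ) − y), iterating θ ← θ − γF(θ). The function names (`vi_field`, `train_vi`), the step size, and the synthetic data are illustrative assumptions; this is not the authors' code from the listed repository.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def vi_field(theta, X, y):
    """Empirical field F(theta) = (1/n) X^T (sigmoid(X theta) - y).

    Unlike the gradient of the squared loss, there is no sigmoid'(X theta)
    factor, which is what keeps the field monotone for a monotone link.
    """
    return X.T @ (sigmoid(X @ theta) - y) / X.shape[0]

def train_vi(X, y, step=0.5, n_iters=2000):
    """Fixed-step iteration theta <- theta - step * F(theta) (hypothetical helper)."""
    theta = np.zeros(X.shape[1])
    for _ in range(n_iters):
        theta = theta - step * vi_field(theta, X, y)
    return theta

# Synthetic GLM data (assumed setup): y = sigmoid(x^T theta_true) + small noise.
rng = np.random.default_rng(0)
n, d = 5000, 5
theta_true = rng.normal(size=d)
X = rng.normal(size=(n, d))
y = sigmoid(X @ theta_true) + 0.05 * rng.normal(size=n)

theta_hat = train_vi(X, y)
err = np.linalg.norm(theta_hat - theta_true)
print("parameter error:", err)
```

For the sigmoid link specifically, this field coincides with the gradient of the cross-entropy loss, so the sketch mainly illustrates the form of the update; the distinction from gradient-based training matters for other monotone links and for the multi-layer extensions the paper describes.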

Code & Models

Repositories

hamrel-cxu/svi-nn-training
PyTorch · Official

Taxonomy

Topics: Advanced Neural Network Applications · Brain Tumor Detection and Classification · Stochastic Gradient Optimization Techniques