An alternative approach to train neural networks using monotone variational inequality
Chen Xu, Xiuyuan Cheng, Yao Xie

TL;DR
This paper introduces a novel neural network training method based on monotone variational inequalities, offering efficient convergence and improved performance for specific tasks like fine-tuning pre-trained models.
Contribution
It develops a new approach using monotone vector fields for neural network training, extending previous work to practical deep learning scenarios.
Findings
Efficient convergence in training neural networks.
Competitive or superior performance compared to SGD.
Applicable to various neural network architectures.
Abstract
We propose an alternative approach to neural network training using monotone vector fields, an idea inspired by the seminal work of Juditsky and Nemirovski [Juditsky & Nemirovski, 2019], which was originally developed to solve parameter estimation problems for generalized linear models (GLM) by reducing the original non-convex problem to the convex problem of solving a monotone variational inequality (VI). Our approach leads to computationally efficient procedures that converge fast and come with guarantees in some special cases, such as training a single-layer neural network or fine-tuning the last layer of a pre-trained model. Our approach can be used for more efficient fine-tuning of a pre-trained model while freezing the bottom layers, an essential step for deploying many machine learning models such as large language models (LLM). We demonstrate its applicability in training fully-connected…
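To make the single-layer case in the abstract concrete, here is a minimal sketch of a monotone-VI style update for a GLM of the form y ≈ φ(w·x). It is an illustration under stated assumptions, not the authors' implementation: the sigmoid activation, the noiseless synthetic data, the step size, and the names `vi_field` and `train_vi` are all hypothetical choices made for this example. The sketch iterates w ← w − γF(w) with the empirical field F(w) = (1/n) Σ x_i (φ(w·x_i) − y_i), which is monotone whenever φ is non-decreasing.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function; monotone non-decreasing.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def vi_field(w, xs, ys, phi=sigmoid):
    """Empirical field F(w) = (1/n) * sum_i x_i * (phi(w . x_i) - y_i)."""
    n = len(xs)
    field = [0.0] * len(w)
    for x, y in zip(xs, ys):
        residual = phi(dot(w, x)) - y
        for j, xj in enumerate(x):
            field[j] += residual * xj / n
    return field


def train_vi(xs, ys, dim, step=1.0, iters=1000):
    # Fixed-step iteration w <- w - step * F(w); step and iters are
    # illustrative values chosen for this toy problem (assumption).
    w = [0.0] * dim
    for _ in range(iters):
        f = vi_field(w, xs, ys)
        w = [wj - step * fj for wj, fj in zip(w, f)]
    return w


# Synthetic, noiseless data from a hypothetical ground-truth weight vector.
rng = random.Random(0)
w_true = [1.0, -0.5, 0.8]
xs = [[rng.gauss(0.0, 1.0) for _ in w_true] for _ in range(200)]
ys = [sigmoid(dot(w_true, x)) for x in xs]

w_hat = train_vi(xs, ys, dim=len(w_true))
err = max(abs(a - b) for a, b in zip(w_hat, w_true))
print(f"max abs error: {err:.2e}")
```

Note that the field omits the φ′(w·x) factor that the gradient of the squared loss ½(φ(w·x) − y)² would carry, which is what separates this update from plain gradient descent on least squares. For the sigmoid specifically, the field happens to coincide with the gradient of the cross-entropy loss.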
Taxonomy
Topics: Advanced Neural Network Applications · Brain Tumor Detection and Classification · Stochastic Gradient Optimization Techniques
