Incremental Majorization-Minimization Optimization with Application to Large-Scale Machine Learning
Julien Mairal (INRIA Grenoble Rhône-Alpes / LJK Laboratoire Jean Kuntzmann)

TL;DR
This paper introduces an incremental majorization-minimization algorithm tailored for large-scale machine learning, providing convergence guarantees and demonstrating competitive performance on logistic regression and sparse estimation tasks.
Contribution
The paper develops an incremental majorization-minimization (MM) scheme for minimizing a large sum of functions, with convergence guarantees for both non-convex and convex problems in large-scale machine learning.
Findings
Convergence guarantees for non-convex and convex optimization.
Linear convergence rate for strongly convex functions.
Competitive performance on logistic regression and sparse estimation.
Abstract
Majorization-minimization algorithms consist of successively minimizing a sequence of upper bounds of the objective function. These upper bounds are tight at the current estimate, and each iteration monotonically drives the objective function downhill. Such a simple principle is widely applicable and has been very popular in various scientific fields, especially in signal processing and statistics. In this paper, we propose an incremental majorization-minimization scheme for minimizing a large sum of continuous functions, a problem of utmost importance in machine learning. We present convergence guarantees for non-convex and convex optimization when the upper bounds approximate the objective up to a smooth error; we call such upper bounds "first-order surrogate functions". More precisely, we study asymptotic stationary point guarantees for non-convex problems, and for convex ones, we…
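The incremental principle described in the abstract can be illustrated with a small sketch. It is not the author's implementation: it assumes one particular choice of upper bound, the quadratic bound that is valid for any function with an L-Lipschitz gradient, applied to an unregularized logistic regression objective. The function names (`logistic_loss`, `logistic_grad_i`, `incremental_mm`) and the synthetic data are hypothetical and chosen only for illustration. Each step refreshes the surrogate of a single randomly chosen term at the current estimate and then moves to the minimizer of the average of all stored surrogates, which has a closed form in this setting.

```python
import numpy as np

def logistic_loss(theta, X, y):
    # Mean logistic loss for labels y in {-1, +1}.
    return np.mean(np.logaddexp(0.0, -y * (X @ theta)))

def logistic_grad_i(theta, x_i, y_i):
    # Gradient of log(1 + exp(-y_i * x_i^T theta)) with respect to theta.
    margin = y_i * (x_i @ theta)
    return -y_i * x_i / (1.0 + np.exp(margin))

def incremental_mm(X, y, n_epochs=20, seed=0):
    # Assumption: each term f_i is majorized at an anchor point k_i by
    #   g_i(t) = f_i(k_i) + grad f_i(k_i)^T (t - k_i) + (L / 2) * ||t - k_i||^2,
    # which is an upper bound when grad f_i is L-Lipschitz.
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # The logistic loss of sample i has a (||x_i||^2 / 4)-Lipschitz gradient.
    L = 0.25 * np.max(np.sum(X ** 2, axis=1))
    theta = np.zeros(d)
    # Z[i] = k_i - grad f_i(k_i) / L is the minimizer of surrogate g_i.
    Z = np.empty((n, d))
    for i in range(n):
        Z[i] = theta - logistic_grad_i(theta, X[i], y[i]) / L
    # The average of the quadratic surrogates is minimized at the mean of Z.
    theta = Z.mean(axis=0)
    for _ in range(n_epochs * n):
        i = rng.integers(n)
        # Refresh only surrogate i, anchoring it at the current estimate.
        z_new = theta - logistic_grad_i(theta, X[i], y[i]) / L
        # Update the running mean of Z without recomputing it from scratch.
        theta = theta + (z_new - Z[i]) / n
        Z[i] = z_new
    return theta

# Synthetic data, for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
w_true = rng.normal(size=5)
y = np.where(X @ w_true + 0.1 * rng.normal(size=200) > 0, 1.0, -1.0)

theta = incremental_mm(X, y)
print("initial loss:", logistic_loss(np.zeros(5), X, y))
print("final loss:  ", logistic_loss(theta, X, y))
```

Under these assumptions the per-step cost is one gradient evaluation plus a running-mean update, at the price of storing one vector per training sample. Other surrogates, such as those suited to sparse estimation, would change the closed-form minimization step.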
Taxonomy
Topics: Sparse and Compressive Sensing Techniques · Stochastic Gradient Optimization Techniques · Distributed Sensor Networks and Detection Algorithms
Methods: Logistic Regression
