Get rid of your constraints and reparametrize: A study in NNLS and implicit bias

Hung-Hsu Chou; Johannes Maly; Claudio Mayrink Verdun; Bernardo Freitas Paulo da Costa; Heudson Mirandola

arXiv:2207.08437 · math.OC · March 11, 2025 · 1 citation

TL;DR

This paper solves non-negative least squares (NNLS) by reparametrizing the objective and running unconstrained gradient descent, which it interprets as a Riemannian gradient descent. It establishes global convergence, accelerated variants, and stability, with implications for understanding implicit bias in neural networks.

Contribution

It connects gradient descent on a reparametrized objective to Riemannian optimization for NNLS, yielding globally convergent and accelerated methods that require no geodesic calculations.
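The link between reparametrization and Riemannian optimization can be illustrated with a short derivation. This is a sketch under assumptions not stated on this page: the simplest depth-2 parametrization $x = u \odot u$ and the loss scaling $f(x) = \tfrac{1}{2}\|Ax - b\|_2^2$. The paper's own setting may be more general.

```latex
% Assumed setting (illustrative): x = u \odot u, f(x) = \tfrac12 \|Ax - b\|_2^2.
\begin{align*}
  L(u) &= f(u \odot u), &
  \nabla L(u) &= 2\, u \odot \nabla f(u \odot u), \\
  \dot u &= -\nabla L(u) = -2\, u \odot \nabla f(x), &
  \dot x &= 2\, u \odot \dot u = -4\, x \odot \nabla f(x).
\end{align*}
% On the positive orthant with metric g_x(v, w) = \sum_i v_i w_i / x_i,
% the Riemannian gradient of f is \operatorname{grad} f(x) = x \odot \nabla f(x), so
\begin{equation*}
  \dot x = -4 \operatorname{grad} f(x).
\end{equation*}
```

Under these assumptions, plain gradient flow in $u$ is a rescaled Riemannian gradient flow in $x$. Since $\dot x_i$ is proportional to $x_i$, a trajectory started with $x(0) > 0$ stays non-negative without any projection step.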

Findings

01 Global convergence of gradient flow on reparametrized objectives
02 Accelerated convergence using second-order ODEs
03 Stability against negative perturbations

Abstract

Over the past years, there has been significant interest in understanding the implicit bias of gradient descent optimization and its connection to the generalization properties of overparametrized neural networks. Several works observed that when training linear diagonal networks on the square loss for regression tasks (which corresponds to overparametrized linear regression) gradient descent converges to special solutions, e.g., non-negative ones. We connect this observation to Riemannian optimization and view overparametrized GD with identical initialization as a Riemannian GD. We use this fact for solving non-negative least squares (NNLS), an important problem behind many techniques, e.g., non-negative matrix factorization. We show that gradient flow on the reparametrized objective converges globally to NNLS solutions, providing convergence rates also for its discretized counterpart.…
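The idea of replacing the non-negativity constraint by a reparametrization can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: it assumes the depth-2 parametrization x = u * u, and the function name `nnls_reparam_gd` as well as the values of `alpha`, `step`, and `iters` are hypothetical choices for this toy problem.

```python
import numpy as np


def nnls_reparam_gd(A, b, alpha=0.1, step=0.05, iters=2000):
    """Approximate argmin_{x >= 0} 0.5 * ||A x - b||^2 by plain gradient
    descent on u, where x = u * u (assumed depth-2 reparametrization)."""
    n = A.shape[1]
    u = np.full(n, alpha)  # identical, strictly positive initialization
    for _ in range(iters):
        x = u * u
        grad_x = A.T @ (A @ x - b)          # gradient of the loss w.r.t. x
        u = u - step * 2.0 * u * grad_x     # chain rule: dL/du = 2 * u * grad_x
    return u * u  # non-negative by construction


# Toy problem whose unconstrained least-squares solution has a negative entry.
A = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
b = np.array([1.0, -1.0, 0.5])
x_hat = nnls_reparam_gd(A, b)
print(x_hat)
```

For this toy problem the NNLS solution can be checked by hand: the second coordinate is clamped to zero and the first solves the remaining one-dimensional least-squares problem, giving (0.75, 0). No projection onto the non-negative orthant is needed because x = u * u cannot become negative. The step size must be small enough for the discrete iteration to remain stable, and the value used here is tuned only for this example.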

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Medical Image Segmentation Techniques