Simple Linear Neuron Boosting

Daniel Munoz

arXiv:2502.01131·cs.LG·February 4, 2025

TL;DR

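This paper revisits Boosted Backpropagation (Grubb & Bagnell, 2010), which optimizes a network's neurons in function space rather than parameter space. It shows that the resulting update is a preconditioned gradient step equivalent to reparameterizing the network to whiten each neuron's features, and it reports fast convergence on architectures such as convolutional networks and transformers.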

Contribution

It proposes an online, matrix-free learning algorithm with adaptive step sizes for optimizing neurons in function space. The algorithm applies wherever autodifferentiation is available.

Findings

01. Fast convergence in both epochs and wall-clock time

02. Applicable to convolutional networks and transformers

03. Simple implementation for local and distributed training

Abstract

Given a differentiable network architecture and loss function, we revisit optimizing the network's neurons in function space using Boosted Backpropagation (Grubb & Bagnell, 2010), in contrast to optimizing in parameter space. From this perspective, we reduce descent in the space of linear functions that optimizes the network's backpropagated-errors to a preconditioned gradient descent algorithm. We show that this preconditioned update rule is equivalent to reparameterizing the network to whiten each neuron's features, with the benefit that the normalization occurs outside of inference. In practice, we use this equivalence to construct an online estimator for approximating the preconditioner and we propose an online, matrix-free learning algorithm with adaptive step sizes. The algorithm is applicable whenever autodifferentiation is available, including convolutional networks and…
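The equivalence stated in the abstract, between a preconditioned update and a whitening reparameterization, can be checked numerically on a single linear neuron. The sketch below is an illustration under stated assumptions, not the paper's algorithm: it uses a squared loss, full-batch gradients, and an explicitly formed second-moment matrix `C`, whereas the paper describes an online, matrix-free estimator. All names (`X`, `C`, `W`, `grad`, `eta`) are hypothetical. With `C = L L^T` and whitening map `W = L^{-1}`, plain gradient descent on weights `v` over whitened features `Z = X W^T` should trace the same function as the update `w <- w - eta * C^{-1} g` on the raw features, because `w = W^T v` and `W^T W = C^{-1}`.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data (assumed for illustration): n samples of d correlated features
# feeding one linear neuron, with targets from a hypothetical linear teacher.
n, d = 256, 4
A = rng.normal(size=(d, d))
X = rng.normal(size=(n, d)) @ A
y = X @ rng.normal(size=d)

C = X.T @ X / n               # second-moment matrix of the neuron's features
L = np.linalg.cholesky(C)     # C = L L^T
W = np.linalg.inv(L)          # whitening map, so that W C W^T = I
Z = X @ W.T                   # whitened features


def grad(features, weights):
    """Gradient of 0.5 * mean squared error for a linear neuron."""
    errors = features @ weights - y   # plays the role of backpropagated errors
    return features.T @ errors / n


eta, steps = 0.5, 5
w = np.zeros(d)   # path 1: preconditioned descent on the raw features
v = np.zeros(d)   # path 2: plain descent on the whitened features

for _ in range(steps):
    w = w - eta * np.linalg.solve(C, grad(X, w))   # applies C^{-1} to the gradient
    v = v - eta * grad(Z, v)

# Map the whitened-space weights back to the raw feature space and compare.
w_from_v = W.T @ v
max_gap = np.max(np.abs(w - w_from_v))
print("max weight gap between the two paths:", max_gap)
```

In path 1 the neuron still consumes the raw features `X`, and `C` enters only the update, which is one way to read the abstract's remark that the normalization occurs outside of inference.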


Taxonomy

Topics: Neural Networks and Applications