Unit-Consistent (UC) Adjoint for GSD and Backprop in Deep Learning Applications

Jeffrey Uhlmann

arXiv:2601.10873·cs.LG·January 19, 2026

Open Access

TL;DR

This paper introduces a unit-consistent (UC) adjoint for backpropagation in deep neural networks, so that optimization steps are consistent with the network's gauge symmetry rather than dependent on an arbitrary parameterization.

Contribution

It proposes an operator-level UC adjoint that replaces the Euclidean transpose in backpropagation, yielding gauge-consistent steepest descent that can be applied uniformly across network components and optimizer state.

Findings

01 UC adjoint improves parameterization invariance

02 Method applies uniformly across network components

03 Enhances stability of gradient-based optimization

Abstract

Deep neural networks constructed from linear maps and positively homogeneous nonlinearities (e.g., ReLU) possess a fundamental gauge symmetry: the network function is invariant to node-wise diagonal rescalings. However, standard gradient descent is not equivariant to this symmetry, causing optimization trajectories to depend heavily on arbitrary parameterizations. Prior work has proposed rescaling-invariant optimization schemes for positively homogeneous networks (e.g., path-based or path-space updates). Our contribution is complementary: we formulate the invariance requirement at the level of the backward adjoint/optimization geometry, which provides a simple, operator-level recipe that can be applied uniformly across network components and optimizer state. By replacing the Euclidean transpose with a Unit-Consistent (UC) adjoint, we derive UC gauge-consistent steepest descent and…
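The gauge symmetry and the non-equivariance of standard gradient descent described in the abstract can be illustrated with a minimal sketch. It assumes a toy network with a scalar input, two ReLU hidden units, no biases, and a squared-error loss; the names `forward`, `rescale`, and `gd_step` are hypothetical and introduced only for this illustration. The sketch does not implement the paper's UC adjoint. It only shows the problem the abstract states: a node-wise positive rescaling leaves the network function unchanged, yet one ordinary Euclidean gradient step from each parameterization gives different functions.

```python
def relu(z):
    return z if z > 0.0 else 0.0

def forward(w1, w2, x):
    # f(x) = sum_j w2[j] * relu(w1[j] * x)
    return sum(b * relu(a * x) for a, b in zip(w1, w2))

def rescale(w1, w2, d):
    # Node-wise diagonal rescaling with positive factors d[j]:
    # incoming weights are multiplied by d[j], outgoing weights divided by d[j].
    return [s * a for a, s in zip(w1, d)], [b / s for b, s in zip(w2, d)]

def gd_step(w1, w2, x, y, lr):
    # One plain (Euclidean) gradient step on L = 0.5 * (f(x) - y)^2.
    r = forward(w1, w2, x) - y
    g1 = [r * b * x * (1.0 if a * x > 0.0 else 0.0) for a, b in zip(w1, w2)]
    g2 = [r * relu(a * x) for a in w1]
    return ([a - lr * g for a, g in zip(w1, g1)],
            [b - lr * g for b, g in zip(w2, g2)])

# Illustrative values (assumptions, not taken from the paper).
x, y, lr = 1.0, 0.0, 0.1
w1, w2 = [1.0, 2.0], [0.5, 0.25]
v1, v2 = rescale(w1, w2, [2.0, 4.0])

# Same function before any update.
f_before = forward(w1, w2, x)
f_before_rescaled = forward(v1, v2, x)

# Different functions after one gradient step from each parameterization.
f_after = forward(*gd_step(w1, w2, x, y, lr), x)
f_after_rescaled = forward(*gd_step(v1, v2, x, y, lr), x)

print(f_before, f_before_rescaled)
print(f_after, f_after_rescaled)
```

The first printed pair should agree and the second should not, which is the dependence on arbitrary parameterization that a gauge-consistent update is meant to remove.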

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Advanced Graph Neural Networks · Model Reduction and Neural Networks