Generalized Optimization: A First Step Towards Category Theoretic Learning Theory
Dan Shiebler

TL;DR
This paper introduces a categorical framework for optimization algorithms using the Cartesian reverse derivative, generalizing gradient descent and Newton's method, and analyzes their invariance and convergence properties.
Contribution
The paper uses the Cartesian reverse derivative to generalize gradient descent and Newton's method in a categorical setting, then shows which invariance and convergence properties carry over to that setting.
Findings
Generalized Newton's method is invariant to all invertible linear transformations.
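Generalized gradient descent is invariant only to orthogonal linear transformations.
The change in loss for generalized gradient descent can be written as an inner product-like expression, which generalizes the non-increasing and convergence properties of the gradient descent optimization flow.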
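The invariance findings above can be checked numerically in the ordinary Euclidean special case. The sketch below is not the paper's categorical construction: it assumes a hypothetical 2-D quadratic loss (`Q`, `b`), a hypothetical step size, and the reparameterized loss l_A(y) = l(A y). It takes one optimizer step in y-space, maps the result back through A, and compares it with the untransformed step.

```python
import math

# Hypothetical quadratic loss l(x) = 0.5 * x^T Q x - b^T x (illustrative values only).
Q = [[3.0, 1.0], [1.0, 2.0]]
b = [1.0, -1.0]

def matvec(M, v):
    return [M[0][0] * v[0] + M[0][1] * v[1], M[1][0] * v[0] + M[1][1] * v[1]]

def matmul(M, N):
    return [[sum(M[i][k] * N[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(M):
    return [[M[0][0], M[1][0]], [M[0][1], M[1][1]]]

def inverse(M):
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [[M[1][1] / det, -M[0][1] / det], [-M[1][0] / det, M[0][0] / det]]

def grad(x):
    # Gradient of the quadratic loss: Q x - b.
    Qx = matvec(Q, x)
    return [Qx[0] - b[0], Qx[1] - b[1]]

def gd_step(x, A, lr=0.1):
    # One gradient descent step on l_A(y) = l(A y), started at y = A^{-1} x,
    # with the result mapped back to x-space.
    y = matvec(inverse(A), x)
    g = matvec(transpose(A), grad(matvec(A, y)))  # chain rule: A^T grad l(A y)
    return matvec(A, [y[i] - lr * g[i] for i in range(2)])

def newton_step(x, A):
    # One Newton step on l_A; the Hessian of l_A is A^T Q A.
    y = matvec(inverse(A), x)
    g = matvec(transpose(A), grad(matvec(A, y)))
    H = matmul(transpose(A), matmul(Q, A))
    d = matvec(inverse(H), g)
    return matvec(A, [y[i] - d[i] for i in range(2)])

def close(u, v, tol=1e-9):
    return all(abs(p - q) <= tol for p, q in zip(u, v))

I = [[1.0, 0.0], [0.0, 1.0]]
t = 0.7
R = [[math.cos(t), -math.sin(t)], [math.sin(t), math.cos(t)]]  # orthogonal (rotation)
S = [[2.0, 1.0], [0.0, 1.0]]                                   # invertible, not orthogonal
x0 = [1.0, 2.0]

gd_rot_ok = close(gd_step(x0, R), gd_step(x0, I))
gd_shear_ok = close(gd_step(x0, S), gd_step(x0, I))
newton_rot_ok = close(newton_step(x0, R), newton_step(x0, I))
newton_shear_ok = close(newton_step(x0, S), newton_step(x0, I))

print("gradient descent: rotation", gd_rot_ok, "| non-orthogonal", gd_shear_ok)
print("Newton's method:  rotation", newton_rot_ok, "| non-orthogonal", newton_shear_ok)
```

The reason is visible in the algebra: mapped back to x-space, the gradient descent update becomes x - lr * A A^T grad l(x), which matches the original update only when A A^T is the identity, whereas the Newton update simplifies to x - Q^{-1} grad l(x) for every invertible A.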
Abstract
The Cartesian reverse derivative is a categorical generalization of reverse-mode automatic differentiation. We use this operator to generalize several optimization algorithms, including a straightforward generalization of gradient descent and a novel generalization of Newton's method. We then explore which properties of these algorithms are preserved in this generalized setting. First, we show that the transformation invariances of these algorithms are preserved: while generalized Newton's method is invariant to all invertible linear transformations, generalized gradient descent is invariant only to orthogonal linear transformations. Next, we show that we can express the change in loss of generalized gradient descent with an inner product-like expression, thereby generalizing the non-increasing and convergence properties of the gradient descent optimization flow. Finally, we include…
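For orientation, the "inner product-like expression" for the change in loss has a familiar counterpart in the ordinary Euclidean setting. The derivation below covers only that special case, under the assumption of a differentiable loss $l$ and the gradient flow $\dot{x}(t) = -\nabla l(x(t))$; the generalized statement in the paper is not reproduced here.

```latex
\begin{aligned}
\frac{d}{dt}\, l(x(t))
  &= \big\langle \nabla l(x(t)),\ \dot{x}(t) \big\rangle \\
  &= -\big\langle \nabla l(x(t)),\ \nabla l(x(t)) \big\rangle \\
  &= -\lVert \nabla l(x(t)) \rVert^{2} \;\le\; 0 .
\end{aligned}
```

Because the right-hand side is never positive, the loss is non-increasing along the flow, and it is stationary exactly where the gradient vanishes.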
Taxonomy
Topics: Advanced Optimization Algorithms Research · Photonic and Optical Devices · Iterative Methods for Nonlinear Equations
