Categorical Foundations of Gradient-Based Learning
G.S.H. Cruttwell, Bruno Gavranović, Neil Ghani, Paul Wilson, Fabio Zanasi

TL;DR
This paper introduces a categorical framework for understanding gradient-based learning algorithms, unifying various methods and loss functions, and demonstrating its applicability in both continuous and discrete settings with a Python implementation.
Contribution
It provides a novel categorical semantics for gradient algorithms, encompassing multiple methods and loss functions, and extends to discrete domains with practical implementation.
Findings
Unified categorical framework for gradient algorithms
Applicability to continuous and discrete domains
Python implementation demonstrating practical relevance
Abstract
We propose a categorical semantics of gradient-based machine learning algorithms in terms of lenses, parametrised maps, and reverse derivative categories. This foundation provides a powerful explanatory and unifying framework: it encompasses a variety of gradient descent algorithms such as ADAM, AdaGrad, and Nesterov momentum, as well as a variety of loss functions such as MSE and Softmax cross-entropy, shedding new light on their similarities and differences. Our approach to gradient-based learning has examples generalising beyond the familiar continuous domains (modelled in categories of smooth maps) and can be realized in the discrete setting of boolean circuits. Finally, we demonstrate the practical significance of our framework with an implementation in Python.
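To make the lens idea in the abstract concrete, here is a minimal Python sketch of a lens as a forward map paired with a backward (reverse derivative) map, together with sequential composition. It is an illustration under stated assumptions, not the paper's implementation: the names `Lens`, `scale`, `square`, and `gd_step` are hypothetical, and the parameter is simply treated as part of the input pair rather than through a dedicated parametrised-map construction.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Lens:
    """Hypothetical lens: a forward map A -> B and a backward map (A, B') -> A'."""
    fwd: Callable[[Any], Any]
    bwd: Callable[[Any, Any], Any]

    def __rshift__(self, other: "Lens") -> "Lens":
        """Sequential composition: run self, then other (chain rule on the way back)."""
        def fwd(a):
            return other.fwd(self.fwd(a))

        def bwd(a, dc):
            b = self.fwd(a)                  # recompute the intermediate value
            db = other.bwd(b, dc)            # pull the change back through `other`
            return self.bwd(a, db)           # then pull it back through `self`

        return Lens(fwd, bwd)


# f(p, x) = p * x, with the parameter p and the input x passed as a pair.
scale = Lens(
    fwd=lambda px: px[0] * px[1],
    bwd=lambda px, dy: (dy * px[1], dy * px[0]),
)

# g(y) = y^2, with reverse derivative dz |-> 2 * y * dz.
square = Lens(
    fwd=lambda y: y * y,
    bwd=lambda y, dz: 2 * y * dz,
)

model = scale >> square  # computes (p * x)^2


def gd_step(p: float, x: float, lr: float = 0.1) -> float:
    """One plain gradient-descent update on p (assumed update rule, for illustration)."""
    dp, _dx = model.bwd((p, x), 1.0)
    return p - lr * dp
```

In this sketch, swapping `gd_step` for a different update rule changes only the last function, which loosely mirrors how the framework treats optimisers such as AdaGrad or Nesterov momentum as interchangeable components.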
Taxonomy
Topics: Topological and Geometric Data Analysis · Neural Networks and Applications · Ferroelectric and Negative Capacitance Devices
Methods: Softmax · AdaGrad
