Categorical Foundations of Gradient-Based Learning

G.S.H. Cruttwell, Bruno Gavranović, Neil Ghani, Paul Wilson, Fabio Zanasi

arXiv:2103.01931 · cs.LG · July 14, 2021 · 5 cites

Open Access

TL;DR

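This paper gives gradient-based learning a categorical semantics built from lenses, parametrised maps, and reverse derivative categories. The framework covers gradient descent algorithms such as ADAM, AdaGrad, and Nesterov momentum and loss functions such as MSE and Softmax cross-entropy, applies to both smooth maps and boolean circuits, and comes with a Python implementation.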

Contribution

The paper provides a categorical semantics for gradient-based learning that covers multiple gradient descent algorithms and loss functions within one framework, extends to discrete domains such as boolean circuits, and is accompanied by a Python implementation.

Findings

01 Unified categorical framework for gradient algorithms

02 Applicability to continuous and discrete domains

03 Python implementation demonstrating practical relevance
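The discrete-domain finding above can be made concrete with a small sketch of a reverse derivative for boolean functions. It assumes the Boolean partial derivative ∂f/∂x_i = f(x ⊕ e_i) ⊕ f(x), a standard notion over Z_2 that may differ in detail from the paper's own construction. The names `reverse_derivative`, `and_gate`, and `xor_gate` are hypothetical and are not taken from the authors' implementation.

```python
from typing import Callable, Tuple

Bits = Tuple[int, ...]


def reverse_derivative(f: Callable[[Bits], int]) -> Callable[[Bits, int], Bits]:
    """Build R[f](x, dy) for a single-output boolean function f.

    Assumption: the i-th partial derivative is f(x with bit i flipped) XOR f(x).
    The result is the transposed Jacobian applied to dy, computed over Z_2.
    """
    def rf(x: Bits, dy: int) -> Bits:
        out = []
        for i in range(len(x)):
            flipped = x[:i] + (x[i] ^ 1,) + x[i + 1:]
            partial = f(flipped) ^ f(x)   # does flipping bit i change the output?
            out.append(partial & dy)      # multiplication in Z_2 is AND
        return tuple(out)
    return rf


def and_gate(x: Bits) -> int:
    return x[0] & x[1]


def xor_gate(x: Bits) -> int:
    return x[0] ^ x[1]


r_and = reverse_derivative(and_gate)
r_xor = reverse_derivative(xor_gate)

print(r_and((1, 0), 1))  # (0, 1): only flipping the second input changes AND here
print(r_xor((1, 0), 1))  # (1, 1): flipping either input changes XOR
```

For AND, the result agrees with the formal derivative of the polynomial x0·x1 over Z_2, namely (x1, x0). A requested output change `dy` of 0 always yields the zero vector.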

Abstract

We propose a categorical semantics of gradient-based machine learning algorithms in terms of lenses, parametrised maps, and reverse derivative categories. This foundation provides a powerful explanatory and unifying framework: it encompasses a variety of gradient descent algorithms such as ADAM, AdaGrad, and Nesterov momentum, as well as a variety of loss functions such as MSE and Softmax cross-entropy, shedding new light on their similarities and differences. Our approach to gradient-based learning has examples generalising beyond the familiar continuous domains (modelled in categories of smooth maps) and can be realized in the discrete setting of boolean circuits. Finally, we demonstrate the practical significance of our framework with an implementation in Python.
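To illustrate how lenses, parametrised maps, and reverse derivatives fit together, here is a minimal sketch in Python. It is not the authors' implementation: the names `Lens`, `linear`, `mse_against`, and `train_step` are hypothetical, the model is a one-parameter scalar map, and the loss is half the squared error so that its derivative has no constant factor.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Lens:
    """A forward map A -> B paired with a backward map (A, B') -> A'."""
    fwd: Callable[[Any], Any]
    bwd: Callable[[Any, Any], Any]

    def __rshift__(self, other: "Lens") -> "Lens":
        """Compose: forward passes run left to right, backward passes right to left."""
        def fwd(a):
            return other.fwd(self.fwd(a))

        def bwd(a, dc):
            b = self.fwd(a)  # recompute the intermediate value for the second lens
            return self.bwd(a, other.bwd(b, dc))

        return Lens(fwd, bwd)


# Parametrised map f(p, a) = p * a, taking the pair (p, a) as input.
# The backward map is its reverse derivative: (a * dy, p * dy).
linear = Lens(
    fwd=lambda pa: pa[0] * pa[1],
    bwd=lambda pa, dy: (pa[1] * dy, pa[0] * dy),
)


def mse_against(target: float) -> Lens:
    """Half squared error against a fixed target, with its reverse derivative."""
    return Lens(
        fwd=lambda y: 0.5 * (y - target) ** 2,
        bwd=lambda y, dl: (y - target) * dl,
    )


def train_step(p: float, a: float, target: float, lr: float = 0.1) -> float:
    """One plain gradient descent update of the parameter p."""
    model_with_loss = linear >> mse_against(target)
    dp, _da = model_with_loss.bwd((p, a), 1.0)
    return p - lr * dp


p = 0.0
for _ in range(50):
    p = train_step(p, a=2.0, target=6.0)
print(p)  # approaches 3.0, since 3.0 * 2.0 == 6.0
```

In this sketch the optimiser is hard-coded as plain gradient descent inside `train_step`. Swapping in a stateful update such as momentum would mean carrying extra state alongside `p`.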

Taxonomy

Topics: Topological and Geometric Data Analysis · Neural Networks and Applications · Ferroelectric and Negative Capacitance Devices

Methods: Softmax · AdaGrad