Universal Approximation Under Constraints is Possible with Transformers

Anastasis Kratsios; Behnoosh Zamanlooy; Tianlin Liu; Ivan Dokmanić

arXiv:2110.03303·cs.LG·February 10, 2022

TL;DR

This paper proves that probabilistic transformers can universally approximate continuous functions while guaranteeing that every output lies in a prescribed constraint set, and it extends classical universal approximation theorems to constrained and manifold-valued functions.

Contribution

It introduces a quantitative constrained universal approximation theorem for probabilistic transformers and a deep neural version of Berge's Maximum Theorem, which yields transformers that approximately minimize an objective function subject to constraints.

Findings

01

Transformers can exactly encode constraints while approximating functions.

02

The universal approximation theorem applies to both convex and non-convex compact constraint sets.

03

Applications include universal approximation of Riemannian manifold-valued functions under geodesic convexity.

Abstract

Many practical problems need the output of a machine learning model to satisfy a set of constraints, $K$. Nevertheless, there is no known guarantee that classical neural network architectures can exactly encode constraints while simultaneously achieving universality. We provide a quantitative constrained universal approximation theorem which guarantees that for any convex or non-convex compact set $K$ and any continuous function $f : \mathbb{R}^{n} \to K$, there is a probabilistic transformer $\hat{F}$ whose randomized outputs all lie in $K$ and whose expected output uniformly approximates $f$. Our second main result is a "deep neural version" of Berge's Maximum Theorem (1963). The result guarantees that given an objective function $L$, a constraint set $K$, and a family of soft constraint sets, there is a probabilistic transformer $\hat{F}$ that approximately minimizes $L$ and whose outputs…
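The first guarantee in the abstract can be illustrated with a small toy sketch. This is not the paper's construction: it only shows why a model that outputs a probability distribution over finitely many points of $K$ has samples that lie in $K$ by design, while its expected output can still track $f$. Everything named here is an assumption made for illustration: $K$ is taken to be the unit circle (compact and non-convex), `target` is a hypothetical $f$, and the softmax logits are computed from $f$ directly as a stand-in for learned attention scores.

```python
import math
import random

def make_anchors(m=64):
    """Finite set of points lying in K (assumed here: the unit circle)."""
    return [(math.cos(2 * math.pi * k / m), math.sin(2 * math.pi * k / m))
            for k in range(m)]

def target(x):
    """Hypothetical continuous f : R -> K, used only for this demo."""
    return (math.cos(x), math.sin(x))

def weights(x, anchors, beta=200.0):
    """Softmax weights over the anchors (stand-in for learned attention)."""
    fx = target(x)
    logits = [-beta * ((a[0] - fx[0]) ** 2 + (a[1] - fx[1]) ** 2)
              for a in anchors]
    top = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - top) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sample_outputs(x, anchors, n, rng):
    """Randomized outputs: each sample is one of the anchors, so it is in K."""
    return rng.choices(anchors, weights=weights(x, anchors), k=n)

def expected_output(x, anchors):
    """Mean of the output distribution: a convex combination of anchors."""
    w = weights(x, anchors)
    return (sum(wi * a[0] for wi, a in zip(w, anchors)),
            sum(wi * a[1] for wi, a in zip(w, anchors)))

rng = random.Random(0)
anchors = make_anchors()
x = 0.7
samples = sample_outputs(x, anchors, 1000, rng)
mean_out = expected_output(x, anchors)

# Every sample has norm 1, i.e. lies on the circle K.
in_K = all(abs(math.hypot(s[0], s[1]) - 1.0) < 1e-9 for s in samples)
# Distance between the expected output and f(x).
fx = target(x)
err = math.hypot(mean_out[0] - fx[0], mean_out[1] - fx[1])
print(in_K, err)
```

In this toy setting the samples satisfy the constraint exactly, whereas the expected output is an average of nearby anchors and therefore sits slightly inside the circle. Increasing the number of anchors and `beta` shrinks that gap, which is the qualitative behaviour the abstract describes for non-convex $K$.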

Videos

Universal Approximation Under Constraints is Possible with Transformers · SlidesLive