Approximation of relation functions and attention mechanisms
Awni Altabaa, John Lafferty

TL;DR
This paper investigates the approximation capabilities of neural network inner products for relation functions, demonstrating universality for symmetric and asymmetric cases, and applies these findings to analyze Transformer attention mechanisms.
Contribution
It establishes the universality of neural network inner products for relation functions and connects these results to attention mechanisms in Transformers.
Findings
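Inner products of neural networks are universal approximators of relation functions: a multi-layer perceptron paired with itself covers symmetric positive-definite relations, and two different multi-layer perceptrons cover asymmetric relations.
In both cases, a bound is given on the number of neurons needed to reach a given approximation accuracy.
The approximation results are applied to analyze Transformer attention mechanisms.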
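To make the link to attention concrete, here is a minimal sketch of standard scaled dot-product attention weights, written so that the score is visibly an inner product of two different feature maps (a query map and a key map). This is the usual textbook formulation, not necessarily the exact one analyzed in the paper; the names `attention_weights`, `W_q`, and `W_k` are illustrative assumptions.

```python
import math


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def inner(u, v):
    """Euclidean inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def attention_weights(W_q, W_k, x, context):
    """Softmax over the relation scores <W_q x, W_k y> / sqrt(d), y in context.

    Assumption: linear query/key maps, as in standard Transformer attention.
    The score is an asymmetric relation because W_q and W_k may differ.
    """
    q = matvec(W_q, x)
    scale = math.sqrt(len(q))
    scores = [inner(q, matvec(W_k, y)) / scale for y in context]
    # Subtract the maximum score before exponentiating for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Illustrative inputs: identity query/key maps in two dimensions.
identity = [[1.0, 0.0], [0.0, 1.0]]
weights = attention_weights(identity, identity, [1.0, 0.0],
                            [[1.0, 0.0], [0.0, 1.0]])
```

With these illustrative inputs, the first context element matches the query direction, so it receives the larger weight, and the weights sum to one.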
Abstract
Inner products of neural network feature maps arise in a wide variety of machine learning frameworks as a method of modeling relations between inputs. This work studies the approximation properties of inner products of neural networks. It is shown that the inner product of a multi-layer perceptron with itself is a universal approximator for symmetric positive-definite relation functions. In the case of asymmetric relation functions, it is shown that the inner product of two different multi-layer perceptrons is a universal approximator. In both cases, a bound is obtained on the number of neurons required to achieve a given accuracy of approximation. In the symmetric case, the function class can be identified with kernels of reproducing kernel Hilbert spaces, whereas in the asymmetric case the function class can be identified with kernels of reproducing kernel Banach spaces. Finally,…
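The two function classes described in the abstract can be sketched in a few lines. The example below is an illustration under assumptions, not the authors' construction: it builds two small, randomly initialized tanh networks (the hypothetical helpers `make_mlp` and `mlp`, with arbitrary layer sizes) and forms the symmetric relation from one network paired with itself and the asymmetric relation from two different networks.

```python
import math
import random


def make_mlp(sizes, seed):
    """Return random weights for a small MLP (hypothetical helper)."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        weights = [[rng.gauss(0.0, 1.0 / math.sqrt(n_in)) for _ in range(n_in)]
                   for _ in range(n_out)]
        biases = [rng.gauss(0.0, 0.1) for _ in range(n_out)]
        layers.append((weights, biases))
    return layers


def mlp(layers, x):
    """Apply the MLP to a list of floats; tanh on every layer but the last."""
    h = list(x)
    for i, (weights, biases) in enumerate(layers):
        h = [sum(w * v for w, v in zip(row, h)) + b
             for row, b in zip(weights, biases)]
        if i < len(layers) - 1:
            h = [math.tanh(v) for v in h]
    return h


def inner(u, v):
    """Euclidean inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


# Two different feature maps with the same (arbitrary) architecture.
phi = make_mlp([2, 8, 4], seed=0)
psi = make_mlp([2, 8, 4], seed=1)


def symmetric_relation(x, y):
    """<phi(x), phi(y)>: symmetric, with a positive semidefinite Gram matrix."""
    return inner(mlp(phi, x), mlp(phi, y))


def asymmetric_relation(x, y):
    """<phi(x), psi(y)>: two different networks, so no symmetry is imposed."""
    return inner(mlp(phi, x), mlp(psi, y))
```

In this sketch `symmetric_relation(x, y)` equals `symmetric_relation(y, x)` by construction, and any quadratic form built from it is non-negative because it is the squared norm of a weighted sum of feature vectors. `asymmetric_relation` carries no such constraint, which is what lets it model relations where the order of the arguments matters.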
Taxonomy
Topics: Cognitive Science and Mapping · Cognitive Computing and Networks
