Lower bounds on transformers with infinite precision
Alexander Kozachinskiy

TL;DR
This note proves the first lower bounds against one-layer softmax transformers with infinite precision, for the function composition and SUM$_2$ tasks, using the VC dimension technique.
Contribution
Earlier lower bounds did not cover one-layer softmax transformers with infinite precision; this note gives the first such bounds, for the function composition and SUM$_2$ tasks.
Findings
Lower bounds proven for function composition task
Lower bounds proven for SUM$_2$ task
Uses VC dimension technique for analysis
Abstract
In this note, we use the VC dimension technique to prove the first lower bound against one-layer softmax transformers with infinite precision. We do so for two tasks: function composition, considered by Peng, Narayanan, and Papadimitriou, and the SUM$_2$ task, considered by Sanford, Hsu, and Telgarsky.
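As a rough illustration of the function composition task named above, the sketch below encodes two functions as lookup tables and evaluates f(g(x)). It also shows a single softmax attention head performing one table lookup. The names `compose`, `softmax_attention`, and the sharpness parameter `beta` are hypothetical and chosen for illustration only; this is not the construction or the proof technique used in the note.

```python
import math

def compose(g, f, x):
    """Function composition task: given tables g and f and an input x, return f(g(x))."""
    return f[g[x]]

def softmax_attention(query, keys, values):
    """One softmax attention head over scalar values (illustrative, not from the note)."""
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    m = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    return sum(w / total * v for w, v in zip(weights, values))

# Hypothetical instance over the domain {0, 1, 2}.
g = [2, 0, 1]
f = [1, 2, 0]
x = 0
answer = compose(g, f, x)

# A single attention lookup can approximately retrieve g(x):
# keys are one-hot position encodings, values are the entries of g,
# and the query is a sharply scaled one-hot vector for x (beta is an assumption).
beta = 50.0
keys = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
query = [beta if j == x else 0.0 for j in range(3)]
retrieved = softmax_attention(query, keys, g)
```

One lookup recovers g(x), but computing f(g(x)) needs a second, dependent lookup. This is only intuition for why the task is a natural candidate for one-layer lower bounds, not a summary of the argument in the note.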
Taxonomy
Topics: Numerical methods in inverse problems · Advanced Mathematical Modeling in Engineering · Matrix Theory and Algorithms
Methods: Softmax
