On the Ability and Limitations of Transformers to Recognize Formal Languages

Satwik Bhattamishra; Kabir Ahuja; Navin Goyal

arXiv:2009.11264·cs.CL·October 9, 2020

TL;DR

This paper investigates which formal languages Transformers can recognize. They do well on a subclass of counter languages but struggle with more complex regular languages, and the paper analyzes how individual model components affect these results.

Contribution

The study systematically analyzes which formal languages Transformers can model. It gives a construction of Transformers for a subclass of counter languages and examines the roles of self-attention and positional encoding.

Findings

1. Transformers perform well on a subclass of counter languages.
2. Transformers' performance degrades on more complex regular languages.
3. Self-attention and positional encoding significantly influence learning and generalization.

Abstract

Transformers have supplanted recurrent models in a large number of NLP tasks. However, the differences in their abilities to model different syntactic properties remain largely unknown. Past works suggest that LSTMs generalize very well on regular languages and have close connections with counter languages. In this work, we systematically study the ability of Transformers to model such languages as well as the role of their individual components in doing so. We first provide a construction of Transformers for a subclass of counter languages, including well-studied languages such as n-ary Boolean Expressions, Dyck-1, and its generalizations. In experiments, we find that Transformers do well on this subclass, and their learned mechanism strongly correlates with our construction. Perhaps surprisingly, in contrast to LSTMs, Transformers do well only on a subset of regular languages with…
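
To make the notion of a counter language concrete, the following is a minimal Python sketch of a Dyck-1 recognizer (well-balanced strings over one bracket pair) that uses a single integer counter. It is an illustration only: the function name `is_dyck1` and the choice of `(` and `)` as symbols are assumptions made here, and the code is not taken from the paper or its repository, nor does it reproduce the paper's Transformer construction.

```python
def is_dyck1(s: str, open_sym: str = "(", close_sym: str = ")") -> bool:
    """Return True if `s` is a well-balanced string over one bracket pair.

    Hypothetical helper for illustration; a single counter tracks the
    current nesting depth, which is what makes Dyck-1 a counter language.
    """
    depth = 0
    for ch in s:
        if ch == open_sym:
            depth += 1
        elif ch == close_sym:
            depth -= 1
            # A negative depth means a closing bracket had no match.
            if depth < 0:
                return False
        else:
            raise ValueError(f"unexpected symbol: {ch!r}")
    # Every opening bracket must have been closed by the end.
    return depth == 0


if __name__ == "__main__":
    for example in ["", "()", "(()())", "())(", "(()"]:
        print(repr(example), is_dyck1(example))
```

The recognizer needs only two checks: the counter never drops below zero on any prefix, and it ends at zero. Languages that can be decided with this kind of bounded bookkeeping over counters are the ones the abstract refers to as counter languages.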

Code & Models

Repositories

satwik77/Transformer-Formal-Languages
PyTorch, official
