Understanding Robust Generalization in Learning Regular Languages

Soham Dan; Osbert Bastani; Dan Roth

arXiv:2202.09717·cs.LG·February 22, 2022

TL;DR

This paper studies how recurrent neural networks (RNNs) can generalize to strings longer than those seen during training when learning regular languages. It proposes a compositional approach that predicts the structure of the accepting automaton, and supports it with theoretical proofs and experiments.

Contribution

The paper introduces a compositional strategy in which an RNN predicts the structure of the deterministic finite-state automaton (DFA) that accepts the language, and shows that this strategy generalizes more robustly than standard end-to-end methods.

Findings

01 The compositional approach generalizes better than end-to-end models under systematic distribution shift.

02 Auxiliary tasks improve robustness to distribution shifts.

03 End-to-end RNNs generalize better than the theoretical lower bound, which suggests some inherent robustness.

Abstract

A key feature of human intelligence is the ability to generalize beyond the training distribution, for instance, parsing longer sentences than seen in the past. Currently, deep neural networks struggle to generalize robustly to such shifts in the data distribution. We study robust generalization in the context of using recurrent neural networks (RNNs) to learn regular languages. We hypothesize that standard end-to-end modeling strategies cannot generalize well to systematic distribution shifts and propose a compositional strategy to address this. We compare an end-to-end strategy that maps strings to labels with a compositional strategy that predicts the structure of the deterministic finite-state automaton (DFA) that accepts the regular language. We theoretically prove that the compositional strategy generalizes significantly better than the end-to-end strategy. In our experiments, we…
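The contrast between the two strategies can be made concrete with a small sketch. The example below is illustrative only and is not taken from the paper: the parity language (binary strings with an even number of 1s), the names `TRANSITIONS`, `state_trace`, `accepts`, and `labeled_strings`, and the length cutoffs are all assumptions chosen for brevity. It shows the two kinds of supervision the abstract distinguishes, (string, label) pairs for the end-to-end strategy versus the DFA structure itself for the compositional strategy, together with a length-based distribution shift.

```python
from itertools import product

# Hypothetical example language (not from the paper): binary strings
# containing an even number of 1s, accepted by a two-state DFA.
ALPHABET = "01"
START_STATE = 0
ACCEPTING = {0}
# DFA structure: (state, symbol) -> next state. This finite table is what
# a compositional learner would try to recover.
TRANSITIONS = {
    (0, "0"): 0,
    (0, "1"): 1,
    (1, "0"): 1,
    (1, "1"): 0,
}


def state_trace(string):
    """Return the DFA state after each prefix, beginning with the start state."""
    states = [START_STATE]
    for symbol in string:
        states.append(TRANSITIONS[(states[-1], symbol)])
    return states


def accepts(string):
    """A string is in the language if the final state is accepting."""
    return state_trace(string)[-1] in ACCEPTING


def labeled_strings(length):
    """All strings of the given length, paired with accept/reject labels."""
    strings = ("".join(chars) for chars in product(ALPHABET, repeat=length))
    return [(s, accepts(s)) for s in strings]


# End-to-end supervision: only (string, label) pairs from short strings.
train = [pair for n in range(1, 5) for pair in labeled_strings(n)]
# Shifted test distribution: strictly longer strings than any seen in training.
test = labeled_strings(8)
```

Because the transition table is finite and independent of string length, a learner that recovers it exactly from short strings labels arbitrarily long strings correctly, whereas a model fit only to `train` pairs has no such guarantee on `test`. In this sketch, `state_trace("110")` returns `[0, 1, 0, 0]`, the kind of per-prefix state sequence that could serve as an intermediate supervision signal.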

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Machine Learning and Algorithms

Methods: Deterministic Finite-State Automaton (DFA)