Understanding Robust Generalization in Learning Regular Languages
Soham Dan, Osbert Bastani, Dan Roth

TL;DR
This paper investigates how recurrent neural networks can better generalize to longer strings in regular languages by proposing a compositional approach that predicts automaton structures, supported by theoretical proofs and empirical experiments.
Contribution
It introduces a compositional strategy for RNNs that predicts DFA structures, demonstrating improved robust generalization over standard end-to-end methods.
Findings
The compositional approach generalizes to longer strings better than end-to-end models.
Auxiliary tasks improve robustness to distribution shifts.
End-to-end RNNs outperform theoretical lower bounds, indicating some inherent robustness.
Abstract
A key feature of human intelligence is the ability to generalize beyond the training distribution, for instance, parsing longer sentences than seen in the past. Currently, deep neural networks struggle to generalize robustly to such shifts in the data distribution. We study robust generalization in the context of using recurrent neural networks (RNNs) to learn regular languages. We hypothesize that standard end-to-end modeling strategies cannot generalize well to systematic distribution shifts and propose a compositional strategy to address this. We compare an end-to-end strategy that maps strings to labels with a compositional strategy that predicts the structure of the deterministic finite-state automaton (DFA) that accepts the regular language. We theoretically prove that the compositional strategy generalizes significantly better than the end-to-end strategy. In our experiments, we…
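The contrast between the two strategies in the abstract can be made concrete with a toy example. The sketch below is illustrative only and is not taken from the paper. The language (binary strings with an even number of 1s), the transition table, and the names `state_trace` and `label` are all assumptions chosen for brevity. It shows the two kinds of supervision one might derive from the same DFA: a single accept/reject label per string (end-to-end) versus the sequence of DFA states visited while reading the string (one possible form of structure-aware target).

```python
from typing import Dict, List, Tuple

# Hypothetical example language: binary strings with an even number of 1s.
# State 0 = "even number of 1s so far", state 1 = "odd number of 1s so far".
START_STATE = 0
ACCEPT_STATES = {0}
TRANSITIONS: Dict[Tuple[int, str], int] = {
    (0, "0"): 0, (0, "1"): 1,
    (1, "0"): 1, (1, "1"): 0,
}


def state_trace(s: str) -> List[int]:
    """Return the DFA state reached after each symbol of s."""
    state = START_STATE
    trace = []
    for ch in s:
        state = TRANSITIONS[(state, ch)]
        trace.append(state)
    return trace


def label(s: str) -> bool:
    """End-to-end target: True if the DFA accepts s."""
    trace = state_trace(s)
    final = trace[-1] if trace else START_STATE
    return final in ACCEPT_STATES


# Short strings stand in for the training distribution,
# a much longer string for the shifted test distribution.
short_train = ["", "1", "11", "101"]
long_test = "1" * 50 + "0" * 50

# End-to-end supervision: one label per string.
end_to_end_targets = [(s, label(s)) for s in short_train]

# Structure-aware supervision: one state per input symbol.
compositional_targets = [(s, state_trace(s)) for s in short_train]
```

In this toy setting the per-symbol targets expose the transition function directly, which does not depend on string length, whereas the single label only reveals the final outcome.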
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Machine Learning and Algorithms
Methods: Deterministic Finite Automaton (DFA)
