Compositional generalization in a deep seq2seq model by separating syntax and semantics
Jake Russin, Jason Jo, Randall C. O'Reilly, Yoshua Bengio

TL;DR
This paper introduces Syntactic Attention, a neural model that separates syntax and semantics to improve compositional generalization in language tasks, outperforming standard models on the SCAN dataset.
Contribution
Inspired by neuroscience work suggesting separate brain systems for syntactic and semantic processing, the paper proposes a neural architecture that imposes an analogous separation in order to improve compositional generalization in NLP.
Findings
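Syntactic Attention substantially outperforms standard deep learning methods on SCAN, a compositional generalization task.
The results suggest that separating syntactic from semantic learning may improve systematic generalization.
The model requires no hand-engineered features or additional supervision.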
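To make the kind of generalization tested on SCAN concrete, here is a toy sketch of a SCAN-style mapping from commands to action sequences. It covers only a tiny hypothetical fragment (four primitives plus `twice` and `thrice`); the `interpret` function, the simplified action names, and the train/test lists are illustrative assumptions, not the dataset's actual grammar, splits, or tooling.

```python
# Toy SCAN-style interpreter: a hypothetical fragment, not the real SCAN grammar.
PRIMITIVES = {"jump": "JUMP", "walk": "WALK", "run": "RUN", "look": "LOOK"}
REPEATS = {"twice": 2, "thrice": 3}

def interpret(command):
    """Map a command such as 'jump twice' to its list of actions."""
    words = command.split()
    if not words or words[0] not in PRIMITIVES:
        raise ValueError(f"unknown primitive in command: {command!r}")
    action = PRIMITIVES[words[0]]
    if len(words) == 1:
        return [action]
    if len(words) == 2 and words[1] in REPEATS:
        return [action] * REPEATS[words[1]]
    raise ValueError(f"unsupported command: {command!r}")

# Illustrative compositional split: 'jump' is seen in training only on its own,
# and the modifiers are seen only with other verbs.
train = ["walk twice", "run thrice", "look twice", "jump"]
test = ["jump twice", "jump thrice"]

for cmd in test:
    print(cmd, "->", " ".join(interpret(cmd)))
```

A learner generalizes compositionally if, having seen `twice` only with other verbs and `jump` only in isolation, it still produces the correct actions for `jump twice`.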
Abstract
Standard methods in deep learning for natural language processing fail to capture the compositional structure of human language that allows for systematic generalization outside of the training distribution. However, human learners readily generalize in this way, e.g. by applying known grammatical rules to novel words. Inspired by work in neuroscience suggesting separate brain systems for syntactic and semantic processing, we implement a modification to standard approaches in neural machine translation, imposing an analogous separation. The novel model, which we call Syntactic Attention, substantially outperforms standard methods in deep learning on the SCAN dataset, a compositional generalization task, without any hand-engineered features or additional supervision. Our work suggests that separating syntactic from semantic learning may be a useful heuristic for capturing compositional…
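The abstract does not spell out the mechanism, so the following is only a minimal sketch of one way such a separation could be imposed in an attention layer: the attention weights are computed from a "syntactic" stream, while the vectors being averaged come from a separate "semantic" stream. The function name `separated_attention`, the example vectors, and this particular factorization are assumptions for illustration, not the authors' implementation.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def separated_attention(query, syntax_keys, semantic_values):
    """Weights come from syntax_keys only; the output mixes semantic_values only."""
    weights = softmax([dot(query, k) for k in syntax_keys])
    dim = len(semantic_values[0])
    context = [
        sum(w * v[i] for w, v in zip(weights, semantic_values))
        for i in range(dim)
    ]
    return weights, context

# Hypothetical two-word input: one syntax key and one semantic vector per word.
syntax_keys = [[1.0, 0.0], [0.0, 1.0]]            # assumed position/role features
query = [2.0, 0.0]                                # assumed decoder query
jump_values = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]  # "jump twice"
walk_values = [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # "walk twice"

w_jump, c_jump = separated_attention(query, syntax_keys, jump_values)
w_walk, c_walk = separated_attention(query, syntax_keys, walk_values)
print(w_jump == w_walk)  # the weights never see the semantic vectors
print(c_jump, c_walk)    # the retrieved content differs
```

Because the weights in this sketch depend only on the syntactic stream, swapping one verb's semantic vector for another changes what is retrieved but not where the model attends.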
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications
