Grounded Graph Decoding Improves Compositional Generalization in Question Answering
Yu Gai, Paras Jain, Wendi Zhang, Joseph E. Gonzalez, Dawn Song, Ion Stoica

TL;DR
This paper introduces Grounded Graph Decoding, an approach that improves compositional generalization in question answering by grounding structured predictions with attention. It outperforms state-of-the-art baselines on the CFQ dataset.
Contribution
The paper proposes Grounded Graph Decoding, a new method that retains syntax information through grounding, improving generalization in complex question answering tasks.
Findings
Achieves 98% accuracy on the MCD1 split of the CFQ dataset.
Outperforms state-of-the-art baselines on compositional generalization benchmarks.
Effectively models structured graph predictions with attention grounding.
Abstract
Question answering models struggle to generalize to novel compositions of training patterns, such as longer sequences or more complex test structures. Current end-to-end models learn a flat input embedding which can lose input syntax context. Prior approaches improve generalization by learning permutation invariant models, but these methods do not scale to more complex train-test splits. We propose Grounded Graph Decoding, a method to improve compositional generalization of language representations by grounding structured predictions with an attention mechanism. Grounding enables the model to retain syntax information from the input, thereby significantly improving generalization over complex inputs. By predicting a structured graph containing conjunctions of query clauses, we learn a group invariant representation without making assumptions on the target domain. Our model…
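The idea that a conjunction of query clauses can be treated as an unordered graph, so that clause order carries no meaning, can be illustrated with a small sketch. This is not the authors' implementation: the `Clause` type, the helper names, and the example predicates (`directed`, `edited`, `film.editor`) are hypothetical and chosen only for illustration. The sketch shows only why a set-of-edges target is invariant to clause permutations; it does not model the attention-based grounding.

```python
from typing import FrozenSet, Iterable, Tuple

# One query clause as a labeled edge: (subject, predicate, object).
# This representation is an assumption made for illustration.
Clause = Tuple[str, str, str]


def as_query_graph(clauses: Iterable[Clause]) -> FrozenSet[Clause]:
    """Treat a conjunction of clauses as an unordered set of labeled edges."""
    return frozenset(clauses)


def exact_match(predicted: Iterable[Clause], gold: Iterable[Clause]) -> bool:
    """Compare two conjunctions while ignoring the order of their clauses."""
    return as_query_graph(predicted) == as_query_graph(gold)


# Hypothetical target query: the same three clauses in two different orders.
gold = [
    ("?x0", "directed", "M1"),
    ("?x0", "edited", "M2"),
    ("?x0", "a", "film.editor"),
]
predicted = list(reversed(gold))

print(exact_match(predicted, gold))   # same clauses, different order
print(exact_match(gold[:2], gold))    # one clause missing
```

A sequence decoder would have to learn that both orderings are equivalent, whereas a graph-valued target builds that equivalence into the output space.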
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
