Structurally Diverse Sampling for Sample-Efficient Training and Comprehensive Evaluation
Shivanshu Gupta, Sameer Singh, Matt Gardner

TL;DR
This paper introduces a model-agnostic algorithm for sampling structurally diverse instances. Training on such samples improves generalization and sample efficiency, and the algorithm can also be used to create more challenging test sets for NLP tasks.
Contribution
The authors propose a model-agnostic algorithm for subsampling structurally diverse training instances from a labeled pool with structured outputs, and show improved generalization and sample efficiency across 5 semantic parsing datasets.
Findings
Structurally diverse training gives comparable or better generalization than prior algorithms in 9 out of 10 dataset-split type pairs.
Structurally diverse training sets are consistently more sample-efficient than random training sets.
Diverse test sets are more challenging than IID test sets.
Abstract
A growing body of research has demonstrated the inability of NLP models to generalize compositionally and has tried to alleviate it through specialized architectures, training schemes, and data augmentation, among other approaches. In this work, we study a different approach: training on instances with diverse structures. We propose a model-agnostic algorithm for subsampling such sets of instances from a labeled instance pool with structured outputs. Evaluating on both compositional template splits and traditional IID splits of 5 semantic parsing datasets of varying complexity, we show that structurally diverse training using our algorithm leads to comparable or better generalization than prior algorithms in 9 out of 10 dataset-split type pairs. In general, we find structural diversity to consistently improve sample efficiency compared to random train sets. Moreover, we show that…
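The abstract does not spell out the sampling procedure, so the following is only a minimal sketch of what structurally diverse subsampling could look like, not the authors' algorithm. It assumes each instance's structured output is a whitespace-tokenized program string, uses token bigrams as a stand-in for substructures, and greedily picks the instance that adds the most substructures not yet covered. The names `substructures` and `diverse_subsample` are hypothetical.

```python
from typing import List, Set, Tuple

Gram = Tuple[str, ...]


def substructures(program: str, n: int = 2) -> Set[Gram]:
    """Token n-grams of a program string (an assumed proxy for real substructures)."""
    tokens = program.split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def diverse_subsample(programs: List[str], k: int) -> List[int]:
    """Greedily pick up to k indices, each adding the most uncovered n-grams."""
    structs = [substructures(p) for p in programs]
    remaining = list(range(len(programs)))
    covered: Set[Gram] = set()
    chosen: List[int] = []
    while remaining and len(chosen) < k:
        # Ties go to the earliest remaining index.
        best = max(remaining, key=lambda i: len(structs[i] - covered))
        if covered and not (structs[best] - covered):
            # Nothing new can be added: reset coverage and start a fresh pass.
            covered = set()
            continue
        chosen.append(best)
        covered |= structs[best]
        remaining.remove(best)
    return chosen


pool = [
    "select a from t",
    "select a from t",   # exact duplicate of the first program
    "select b from t",
    "count ( x )",
]
print(diverse_subsample(pool, 3))
```

In this toy pool the duplicate program is only chosen once every other instance has been taken, which is the intuition behind preferring structural coverage over random draws.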
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Machine Learning and Data Classification
