A Minimalist Dataset for Systematic Generalization of Perception, Syntax, and Semantics
Qing Li, Siyuan Huang, Yining Hong, Yixin Zhu, Ying Nian Wu, Song-Chun Zhu

TL;DR
This paper introduces HINT, a minimalistic dataset designed to evaluate machine learning models' ability to generalize systematically across perception, syntax, and semantics in arithmetic tasks, revealing current models' limitations and potential improvements.
Contribution
The paper presents HINT, a new dataset for testing systematic generalization in perception, syntax, and semantics, along with extensive experiments that highlight the challenges current models face and the effectiveness of chain-of-thought prompting.
Findings
Models struggle with long-range syntactic dependencies.
Scaling dataset and model size yields limited improvements.
Chain-of-thought prompting significantly improves GPT-3 zero-shot performance.
Abstract
Inspired by humans' exceptional ability to master arithmetic and generalize to new problems, we present a new dataset, Handwritten arithmetic with INTegers (HINT), to examine machines' capability of learning generalizable concepts at three levels: perception, syntax, and semantics. In HINT, machines are tasked with learning how concepts are perceived from raw signals such as images (i.e., perception), how multiple concepts are structurally combined to form a valid expression (i.e., syntax), and how concepts are realized to afford various reasoning tasks (i.e., semantics), all in a weakly supervised manner. Focusing on systematic generalization, we carefully design a five-fold test set to evaluate both the interpolation and the extrapolation of learned concepts w.r.t. the three levels. Further, we design a few-shot learning split to determine whether or not models can rapidly learn new…
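To make the syntax and semantics levels named in the abstract concrete, here is a minimal sketch. It is not the HINT pipeline or any model from the paper. It assumes symbols have already been recognized (the perception step is skipped), operands are single digits, and the operators `+`, `-`, `*` follow ordinary integer arithmetic; HINT's own operator definitions may differ. The function names `parse` and `evaluate` are hypothetical.

```python
from typing import List, Tuple, Union

# A tree node is either an int leaf or an (operator, left, right) tuple.
Node = Union[int, Tuple[str, "Node", "Node"]]


def parse(tokens: List[str]) -> Node:
    """Syntax level: build a tree from a symbol sequence.

    Assumed grammar (illustrative only):
      expr   := term (('+' | '-') term)*
      term   := factor ('*' factor)*
      factor := digit | '(' expr ')'
    """
    pos = 0

    def expr() -> Node:
        nonlocal pos
        node = term()
        while pos < len(tokens) and tokens[pos] in ("+", "-"):
            op = tokens[pos]
            pos += 1
            node = (op, node, term())
        return node

    def term() -> Node:
        nonlocal pos
        node = factor()
        while pos < len(tokens) and tokens[pos] == "*":
            pos += 1
            node = ("*", node, factor())
        return node

    def factor() -> Node:
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            node = expr()
            pos += 1  # consume the matching ')'
            return node
        return int(tok)

    return expr()


def evaluate(node: Node) -> int:
    """Semantics level: reduce the tree to an integer result."""
    if isinstance(node, int):
        return node
    op, left, right = node
    a, b = evaluate(left), evaluate(right)
    if op == "+":
        return a + b
    if op == "-":
        return a - b
    return a * b


tree = parse(list("2+3*(4-1)"))
result = evaluate(tree)
```

In this sketch the parser is hand-written and the symbols are given as characters. In the weakly supervised setting the abstract describes, a model would instead have to learn perception, syntax, and semantics from the data.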
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Machine Learning and Data Classification · Topic Modeling
Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Byte Pair Encoding · Layer Normalization · Residual Connection · Adam · Dropout · Label Smoothing · Multi-Head Attention
