CURI: A Benchmark for Productive Concept Learning Under Uncertainty
Ramakrishna Vedantam, Arthur Szlam, Maximilian Nickel, Ari Morcos, Brenden Lake

TL;DR
CURI is a new benchmark designed to evaluate how well models can learn and reason about complex, uncertain, and compositional concepts in a few-shot setting, addressing limitations of traditional classification tasks.
Contribution
The paper introduces CURI, a comprehensive benchmark for systematic generalization and reasoning under uncertainty across multiple modalities, with a model-independent measure of compositionality difficulty.
Findings
Models show significant room for improvement on CURI tasks.
Extensive evaluations reveal challenges in out-of-distribution generalization.
Benchmark spans images, schemas, and sounds, encouraging diverse modeling approaches.
Abstract
Humans can learn and reason under substantial uncertainty in a space of infinitely many concepts, including structured relational concepts ("a scene with objects that have the same color") and ad-hoc categories defined through goals ("objects that could fall on one's head"). In contrast, standard classification benchmarks: 1) consider only a fixed set of category labels, 2) do not evaluate compositional concept learning, and 3) do not explicitly capture a notion of reasoning under uncertainty. We introduce a new few-shot, meta-learning benchmark, Compositional Reasoning Under Uncertainty (CURI), to bridge this gap. CURI evaluates different aspects of productive and systematic generalization, including abstract understandings of disentangling, productive generalization, learning boolean operations, variable binding, etc. Importantly, it also defines a model-independent "compositionality…
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Machine Learning and Data Classification · Multimodal Machine Learning Applications
