AbsPyramid: Benchmarking the Abstraction Ability of Language Models with a Unified Entailment Graph
Zhaowei Wang, Haochen Shi, Weiqi Wang, Tianqing Fang, Hongming Zhang, Sehyun Choi, Xin Liu, Yangqiu Song

TL;DR
AbsPyramid is a comprehensive benchmark dataset of 221,000 textual descriptions designed to evaluate and improve the abstraction capabilities of language models across diverse events and domains.
Contribution
The paper introduces AbsPyramid, a large-scale, unified entailment graph that enables systematic evaluation and training of language models' abstraction abilities in open-domain settings.
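To make the idea of an entailment graph concrete, here is a minimal Python sketch. It is not the authors' implementation or data format. The class name `EntailmentGraph`, its methods, and the example descriptions are all hypothetical, and treating entailment as transitive is an assumption made for illustration.

```python
from collections import defaultdict


class EntailmentGraph:
    """Hypothetical sketch: a directed graph over textual descriptions.

    An edge (specific -> abstract) records that the specific description
    entails the more abstract one.
    """

    def __init__(self):
        # Maps each description to the set of its direct abstractions.
        self._abstractions = defaultdict(set)

    def add_edge(self, specific, abstract):
        """Record that `specific` entails `abstract`."""
        self._abstractions[specific].add(abstract)

    def entails(self, specific, abstract):
        """Return True if `abstract` is reachable from `specific`.

        Assumes entailment is transitive; a description also
        trivially entails itself.
        """
        seen = set()
        stack = [specific]
        while stack:
            node = stack.pop()
            if node == abstract:
                return True
            if node in seen:
                continue
            seen.add(node)
            # .get avoids creating empty entries in the defaultdict.
            stack.extend(self._abstractions.get(node, ()))
        return False


# Made-up example descriptions, not taken from the dataset.
graph = EntailmentGraph()
graph.add_edge("drink a cup of coffee", "drink a beverage")
graph.add_edge("drink a beverage", "consume something")

print(graph.entails("drink a cup of coffee", "consume something"))
print(graph.entails("consume something", "drink a cup of coffee"))
```

Under these assumptions, a specific description entails its abstraction but not the reverse, which is the directional relation an abstraction benchmark would ask a model to judge.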
Findings
Current LLMs struggle with abstraction in zero-shot and few-shot scenarios.
Training on AbsPyramid improves LLMs' abstraction skills and generalization.
The benchmark enhances performance on existing abstraction tasks.
Abstract
Cognitive research indicates that abstraction ability is essential in human intelligence, which remains under-explored in language models. In this paper, we present AbsPyramid, a unified entailment graph of 221K textual descriptions of abstraction knowledge. While existing resources only touch nouns or verbs within simplified events or specific domains, AbsPyramid collects abstract knowledge for three components of diverse events to comprehensively evaluate the abstraction ability of language models in the open domain. Experimental results demonstrate that current LLMs face challenges comprehending abstraction knowledge in zero-shot and few-shot settings. By training on our rich abstraction knowledge, we find LLMs can acquire basic abstraction abilities and generalize to unseen events. In the meantime, we empirically show that our benchmark is comprehensive to enhance LLMs across two…
Taxonomy
Topics: Topic Modeling · Machine Learning in Materials Science · Ferroelectric and Negative Capacitance Devices
