CANDLE: Iterative Conceptualization and Instantiation Distillation from Large Language Models for Commonsense Reasoning
Weiqi Wang, Tianqing Fang, Chunyang Li, Haochen Shi, Wenxuan Ding, Baixuan Xu, Zhaowei Wang, Jiaxin Bai, Xin Liu, Jiayang Cheng, Chunkit Chan, Yangqiu Song

TL;DR
CANDLE is a framework that enhances commonsense reasoning by iteratively distilling and generating conceptualizations and instantiations from large language models, creating a large, high-quality knowledge base that improves downstream tasks.
Contribution
It introduces an iterative distillation method for generating and filtering conceptual and instantiated knowledge from LLMs, reducing reliance on manual annotations and expanding reasoning capabilities.
Findings
Constructed a knowledge base of six million triples by applying CANDLE to ATOMIC.
Demonstrated improved performance on four downstream tasks.
Produced high-quality, diverse knowledge through critic filtering.
Abstract
The sequential process of conceptualization and instantiation is essential to generalizable commonsense reasoning as it allows the application of existing knowledge to unfamiliar scenarios. However, existing works tend to undervalue the step of instantiation and heavily rely on pre-built concept taxonomies and human annotations to collect both types of knowledge, resulting in a lack of instantiated knowledge to complete reasoning, high cost, and limited scalability. To tackle these challenges, we introduce CANDLE, a distillation framework that iteratively performs contextualized conceptualization and instantiation over commonsense knowledge bases by instructing large language models to generate both types of knowledge with critic filtering. By applying CANDLE to ATOMIC, we construct a comprehensive knowledge base comprising six million conceptualizations and instantiated commonsense…
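The alternating loop described in the abstract can be pictured with a small sketch. This is an illustration under stated assumptions, not the authors' implementation: the function name `distill`, the toy lookup tables, the scoring rule, and the 0.5 threshold are all hypothetical stand-ins. In CANDLE the generation steps are performed by instructing large language models and the filtering by critics; here both are replaced by hard-coded toy functions so the control flow can be run as-is.

```python
from typing import Callable, Iterable, List, Set, Tuple

# Hypothetical signatures: a generator maps a text to candidate texts,
# a critic maps (source, candidate) to a plausibility score in [0, 1].
Generator = Callable[[str], List[str]]
Critic = Callable[[str, str], float]


def distill(
    seed_events: Iterable[str],
    conceptualize: Generator,
    instantiate: Generator,
    critic: Critic,
    threshold: float = 0.5,  # assumed cutoff, not taken from the paper
    rounds: int = 2,
) -> Tuple[Set[str], Set[str]]:
    """Alternate conceptualization and instantiation with critic filtering."""
    seeds = set(seed_events)
    concepts: Set[str] = set()
    instances: Set[str] = set(seeds)
    frontier = list(seeds)

    for _ in range(rounds):
        # Step 1: abstract each concrete event; keep critic-approved abstractions.
        new_concepts = []
        for event in frontier:
            for concept in conceptualize(event):
                if concept not in concepts and critic(event, concept) >= threshold:
                    concepts.add(concept)
                    new_concepts.append(concept)

        # Step 2: ground each new abstraction in fresh concrete events.
        frontier = []
        for concept in new_concepts:
            for instance in instantiate(concept):
                if instance not in instances and critic(concept, instance) >= threshold:
                    instances.add(instance)
                    frontier.append(instance)

        if not frontier:  # nothing new to feed into the next round
            break

    # Return the accepted abstractions and only the newly generated instances.
    return concepts, instances - seeds


# Toy stand-ins for the LLM generators and the critic (illustrative only).
TOY_CONCEPTS = {
    "PersonX plays football": ["PersonX plays a sport", "PersonX plays a thing"],
    "PersonX plays tennis": ["PersonX plays a sport"],
}
TOY_INSTANCES = {
    "PersonX plays a sport": ["PersonX plays tennis", "PersonX plays football"],
}


def toy_conceptualize(event: str) -> List[str]:
    return TOY_CONCEPTS.get(event, [])


def toy_instantiate(concept: str) -> List[str]:
    return TOY_INSTANCES.get(concept, [])


def toy_critic(source: str, candidate: str) -> float:
    # Pretend that overly generic abstractions score low.
    return 0.1 if "thing" in candidate else 0.9


if __name__ == "__main__":
    found_concepts, found_instances = distill(
        ["PersonX plays football"], toy_conceptualize, toy_instantiate, toy_critic
    )
    print(sorted(found_concepts))
    print(sorted(found_instances))
```

In this toy run the overly generic abstraction is rejected by the critic, and the surviving abstraction yields one concrete event that was not in the seed set, which is the sense in which the loop applies existing knowledge to unfamiliar scenarios.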
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
Methods: Balanced Selection
