Can ChatGPT Make Explanatory Inferences? Benchmarks for Abductive Reasoning
Paul Thagard

TL;DR
This paper introduces benchmarks for evaluating the ability of AI programs to perform explanatory inference and applies them to ChatGPT, showing that it makes creative and evaluative inferences in many domains, although it is limited to verbal and visual modalities.
Contribution
It proposes a set of benchmarks for abductive reasoning and uses them to determine how far ChatGPT is capable of explanatory inference across many domains.
Findings
ChatGPT performs well in creative and evaluative inferences.
It is limited to verbal and visual modalities.
Claims that ChatGPT and similar models are incapable of explanation, understanding, causal reasoning, meaning, and creativity are rebutted.
Abstract
Explanatory inference is the creation and evaluation of hypotheses that provide explanations, and is sometimes known as abduction or abductive inference. Generative AI is a new set of artificial intelligence models based on novel algorithms for generating text, images, and sounds. This paper proposes a set of benchmarks for assessing the ability of AI programs to perform explanatory inference, and uses them to determine the extent to which ChatGPT, a leading generative AI model, is capable of making explanatory inferences. Tests on the benchmarks reveal that ChatGPT performs creative and evaluative inferences in many domains, although it is limited to verbal and visual modalities. Claims that ChatGPT and similar models are incapable of explanation, understanding, causal reasoning, meaning, and creativity are rebutted.
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Topic Modeling · Machine Learning in Healthcare
Methods: Sparse Evolutionary Training
