Evaluating the Meta- and Object-Level Reasoning of Large Language Models for Question Answering

Nick Ferguson; Liane Guillou; Alan Bundy; Kwabena Nuamah

arXiv:2502.10338·cs.CL·February 17, 2025

Open Access

TL;DR

This paper evaluates large language models' ability to perform complex question answering tasks involving meta- and object-level reasoning, revealing strengths in strategic reasoning but challenges in detailed, lower-level reasoning.

Contribution

Introduces the Franklin dataset to assess meta- and object-level reasoning in LLMs and provides a comprehensive evaluation of their reasoning capabilities.

Findings

01. LLMs frequently demonstrate meta-level reasoning

02. LLMs struggle with object-level reasoning in some datasets

03. LLMs perform well on meta-level reasoning in the Franklin dataset

Abstract

Large Language Models (LLMs) excel in natural language tasks but still face challenges in Question Answering (QA) tasks requiring complex, multi-step reasoning. We outline the types of reasoning required in some of these tasks, and reframe them in terms of meta-level reasoning (akin to high-level strategic reasoning or planning) and object-level reasoning (embodied in lower-level tasks such as mathematical reasoning). Franklin, a novel dataset with requirements of meta- and object-level reasoning, is introduced and used along with three other datasets to evaluate four LLMs at question answering tasks requiring multiple steps of reasoning. Results from human annotation studies suggest LLMs demonstrate meta-level reasoning with high frequency, but struggle with object-level reasoning tasks in some of the datasets used. Additionally, evidence suggests that LLMs find the object-level…

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech and dialogue systems