Order Matters in Hallucination: Reasoning Order as Benchmark and Reflexive Prompting for Large-Language-Models

Zikai Xie

arXiv:2408.05093 · cs.CL · May 13, 2025 · 2 cites

TL;DR

This paper investigates how the order of reasoning and answering affects large language models' consistency and hallucination issues, proposing a new benchmark and prompt strategy to improve their factual reliability.

Contribution

It introduces an order-based benchmark for assessing LLM consistency and a reflexive prompting method to reduce hallucinations and factual errors.

Findings

01. Order of reasoning impacts LLM consistency

02. The proposed prompt improves factual accuracy

03. Benchmark effectively identifies hallucination instances

Abstract

Large language models (LLMs) have generated significant attention since their inception, finding applications across various academic and industrial domains. However, these models often suffer from the "hallucination problem", where outputs, though grammatically and logically coherent, lack factual accuracy or are entirely fabricated. A particularly troubling issue discovered and widely discussed recently is the numerical comparison error where multiple LLMs incorrectly infer that "9.11 > 9.9". We discovered that the order in which LLMs generate answers and reasoning impacts their consistency. Specifically, results vary significantly when an LLM generates an answer first and then provides the reasoning versus generating the reasoning process first and then the conclusion. Inspired by this, we propose a new benchmark method for assessing LLM consistency: comparing responses generated…
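The order-based consistency check described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the prompt templates, the `ask` callable (any function that sends a prompt to a model and returns its text), the `extract_answer` helper, and the toy model are all hypothetical names introduced here for illustration. The sketch only assumes that the benchmark compares the final answer a model gives when asked to answer first against the one it gives when asked to reason first.

```python
from typing import Callable, Iterable

# Hypothetical prompt templates; the paper's exact wording is not reproduced here.
ANSWER_FIRST = "{q}\nGive the final answer first, then explain your reasoning."
REASONING_FIRST = "{q}\nExplain your reasoning step by step first, then give the final answer."


def extract_answer(response: str) -> str:
    """Assumed response format: the final answer follows the last 'Answer:' tag."""
    return response.split("Answer:")[-1].strip()


def order_consistency(
    questions: Iterable[str],
    ask: Callable[[str], str],
    extract: Callable[[str], str] = extract_answer,
) -> float:
    """Fraction of questions whose extracted answer is identical under both orders."""
    questions = list(questions)
    if not questions:
        raise ValueError("need at least one question")
    agree = 0
    for q in questions:
        answer_first = extract(ask(ANSWER_FIRST.format(q=q)))
        reasoning_first = extract(ask(REASONING_FIRST.format(q=q)))
        agree += answer_first == reasoning_first
    return agree / len(questions)


def toy_model(prompt: str) -> str:
    """Stand-in for an LLM call: flips its answer on the 9.11 vs 9.9 question."""
    if "9.11" in prompt:
        if "answer first" in prompt:
            return "Answer: 9.11"
        return "9.9 equals 9.90, and 9.90 > 9.11. Answer: 9.9"
    return "Answer: 4"


questions = ["Which is larger, 9.11 or 9.9?", "What is 2 + 2?"]
score = order_consistency(questions, toy_model)
```

With the toy model above, the two orderings disagree on the numerical comparison and agree on the arithmetic question, so `score` is 0.5. In practice `ask` would wrap a real model client, and answer extraction would need to be more robust than a single string split.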

Code & Models

Repositories

xiezikai/reflexiveprompting (PyTorch, official)

Taxonomy

Topics: Psychiatry, Mental Health, Neuroscience · Mental Health and Psychiatry

Methods: Softmax · Attention Is All You Need