GSAR: Typed Grounding for Hallucination Detection and Recovery in Multi-Agent LLMs

Federico A. Kamelhar

arXiv:2604.23366 · cs.AI · April 28, 2026

TL;DR

GSAR is a framework for evaluating and improving the grounding of claims in multi-agent LLM systems. It sorts claims into a four-way typology, weights evidence by type, and guides recovery decisions under an explicit compute budget.

Contribution

GSAR is presented as the first groundedness framework to combine evidence-typed scoring with tiered recovery and explicit compute budgeting for multi-agent LLMs.

Findings

01. GSAR yields consistent evaluations across multiple LLM judges.

02. Ablation studies confirm the importance of complementary evidence in grounding.

03. GSAR outperforms existing methods such as Vectara HHEM-2.1-Open in grounding accuracy.

Abstract

Autonomous multi-agent LLM systems are increasingly deployed to investigate operational incidents and produce structured diagnostic reports. Their trustworthiness hinges on whether each claim is grounded in observed evidence rather than model-internal inference. Existing groundedness evaluators (binary classifiers, LLM-as-judge scalars, self-correction loops) treat supporting evidence as interchangeable and emit a single signal that offers no principled control over downstream action. We present GSAR, a grounding-evaluation and replanning framework that (i) partitions claims into a four-way typology (grounded, ungrounded, contradicted, complementary), giving first-class standing to non-redundant alternative perspectives; (ii) assigns evidence-type-specific weights reflecting epistemic strength; (iii) computes an asymmetric contradiction-penalised weighted groundedness score; and (iv)…
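
To make steps (i) to (iii) concrete, here is a minimal Python sketch of a typed, contradiction-penalised groundedness score. It is an illustration under stated assumptions, not the paper's formula. The evidence-type names, the weights, the half credit for complementary claims, the penalty factor of 2.0, and the clamping to [0, 1] are all hypothetical choices made for this example. Only the four claim labels come from the abstract.

```python
from dataclasses import dataclass
from enum import Enum


class Label(Enum):
    """The four-way claim typology named in the abstract."""
    GROUNDED = "grounded"
    UNGROUNDED = "ungrounded"
    CONTRADICTED = "contradicted"
    COMPLEMENTARY = "complementary"


# Hypothetical evidence types and weights (assumed, not from the paper):
# directly observed evidence counts more than model-internal inference.
EVIDENCE_WEIGHTS = {
    "tool_output": 1.0,
    "log_excerpt": 0.8,
    "model_inference": 0.3,
}


@dataclass(frozen=True)
class Claim:
    text: str
    label: Label
    evidence_type: str  # must be a key of EVIDENCE_WEIGHTS


def groundedness_score(claims, contradiction_penalty=2.0, complementary_credit=0.5):
    """Return a weighted groundedness score in [0, 1].

    Assumed scoring rule (illustrative only):
      - grounded claims add their full evidence weight,
      - complementary claims add a fraction of their weight,
      - ungrounded claims add nothing,
      - contradicted claims subtract a multiple of their weight, so a
        contradiction costs more than a missing ground (the asymmetry).
    """
    if not claims:
        return 0.0

    support = 0.0
    penalty = 0.0
    total_weight = 0.0
    for claim in claims:
        weight = EVIDENCE_WEIGHTS[claim.evidence_type]
        total_weight += weight
        if claim.label is Label.GROUNDED:
            support += weight
        elif claim.label is Label.COMPLEMENTARY:
            support += complementary_credit * weight
        elif claim.label is Label.CONTRADICTED:
            penalty += contradiction_penalty * weight

    raw = (support - penalty) / total_weight
    return max(0.0, min(1.0, raw))  # clamp to [0, 1]


report = [
    Claim("Disk latency spiked at 02:14", Label.GROUNDED, "tool_output"),
    Claim("A deploy happened shortly before", Label.COMPLEMENTARY, "log_excerpt"),
    Claim("The root cause is a bad driver", Label.UNGROUNDED, "model_inference"),
]
print(round(groundedness_score(report), 3))
```

Under these assumed weights, relabelling the third claim from ungrounded to contradicted lowers the score, which is the asymmetric behaviour the abstract describes. A downstream replanner could compare this score against thresholds to choose an action, but the abstract as shown here is truncated before it describes that step.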
