Semantic Grounding Index: Geometric Bounds on Context Engagement in RAG Systems

Javier Marín

arXiv:2512.13771 · cs.AI · December 17, 2025

TL;DR

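The paper introduces the Semantic Grounding Index (SGI), a geometric measure in embedding space that flags hallucinated RAG responses by their semantic laziness: they stay angularly close to the question instead of moving toward the retrieved context. The measure is validated on HaluEval across five embedding models and supported by a bound derived from the spherical triangle inequality.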

Contribution

It proposes the SGI metric based on angular distances, providing a new, theoretically grounded tool for assessing response engagement in RAG systems.
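A minimal sketch of how such a ratio could be computed, assuming SGI is the angular distance from the response to the question divided by the angular distance from the response to the context (one reading of the abstract's definition). The function names, the `eps` guard, and the toy 2-D vectors are illustrative assumptions, not the author's code; in practice the inputs would be embeddings from a sentence-embedding model.

```python
import numpy as np


def angular_distance(u, v):
    """Angle in radians between u and v, i.e. their geodesic distance
    once both are projected onto the unit hypersphere."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    cos_sim = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    # Clip to guard against values slightly outside [-1, 1] from rounding.
    return float(np.arccos(np.clip(cos_sim, -1.0, 1.0)))


def semantic_grounding_index(question, context, response, eps=1e-12):
    """Assumed form: theta(response, question) / theta(response, context).
    eps (hypothetical guard) avoids division by zero when the response
    embedding coincides with the context embedding."""
    theta_rq = angular_distance(response, question)
    theta_rc = angular_distance(response, context)
    return theta_rq / max(theta_rc, eps)


# Toy 2-D stand-ins for embeddings: question and context are 90 degrees apart.
question = np.array([1.0, 0.0])
context = np.array([0.0, 1.0])

# A "lazy" response sits 30 degrees from the question, 60 from the context.
lazy = np.array([np.cos(np.pi / 6), np.sin(np.pi / 6)])
# An "engaged" response sits 60 degrees from the question, 30 from the context.
engaged = np.array([np.cos(np.pi / 3), np.sin(np.pi / 3)])

sgi_lazy = semantic_grounding_index(question, context, lazy)
sgi_engaged = semantic_grounding_index(question, context, engaged)
print(f"lazy: {sgi_lazy:.2f}, engaged: {sgi_engaged:.2f}")
```

Under this assumed definition, a response that stays near the question scores below 1 and a response that moves toward the context scores above 1; any decision threshold would have to be calibrated on labeled data.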

Findings

01

SGI effectively detects semantic laziness in RAG responses.

02

SGI's discriminative power increases with question-context angular separation.

03

SGI scores correlate with response length and question brevity.

Abstract

When retrieval-augmented generation (RAG) systems hallucinate, what geometric trace does this leave in embedding space? We introduce the Semantic Grounding Index (SGI), defined as the ratio of angular distances from the response to the question versus the context on the unit hypersphere $S^{d-1}$. Our central finding is "semantic laziness": hallucinated responses remain angularly proximate to questions rather than departing toward retrieved contexts. On HaluEval ($n = 5{,}000$), we observe large effect sizes (Cohen's $d$ ranging from 0.92 to 1.28) across five embedding models with mean cross-model correlation $r = 0.85$. Crucially, we derive from the spherical triangle inequality that SGI's discriminative power should increase with question-context angular separation $\theta(q, c)$, a theoretical prediction confirmed empirically: effect size rises monotonically from $d = 0.61$ -low…
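One way to see why separation matters, sketched here under the assumption that SGI is $\theta(r, q) / \theta(r, c)$ for a response embedding $r$ (the symbol $r$ and this exact form are assumptions, and the paper's own derivation may differ). Angular distance on the unit hypersphere is a metric, so the triangle inequality applies to the three points $q$, $c$, and $r$:

```latex
% Triangle inequality for angular distance on S^{d-1}
\[
  \bigl|\theta(r, q) - \theta(r, c)\bigr| \;\le\; \theta(q, c)
\]
% Divide both sides by \theta(r, c) > 0 and substitute the assumed
% definition SGI = \theta(r, q) / \theta(r, c):
\[
  \bigl|\mathrm{SGI} - 1\bigr| \;\le\; \frac{\theta(q, c)}{\theta(r, c)}
\]
```

When the question and context embeddings nearly coincide, $\theta(q, c) \to 0$ and the bound pins SGI near 1 for every response, grounded or not, so the score cannot separate the two groups. Larger $\theta(q, c)$ widens the interval of attainable SGI values, which is consistent with the abstract's claim that discriminative power grows with question-context separation.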

Code & Models

Datasets

AICoevolution/s64-geometry-v1
dataset · 39 downloads

Taxonomy

Topics: Topic Modeling · Expert finding and Q&A systems · Deception detection and forensic psychology