DICE: Discrete Interpretable Comparative Evaluation with Probabilistic Scoring for Retrieval-Augmented Generation

Shiyan Liu; Jian Ma; Rui Qu

arXiv:2512.22629·cs.AI·December 30, 2025

Open Access

TL;DR

DICE introduces an explainable, robust, and efficient evaluation framework for RAG systems using probabilistic scoring and a tournament approach, improving interpretability and reducing computational costs.

Contribution

The paper presents DICE, a novel evaluation method combining probabilistic scoring and a tournament strategy to enhance explainability and efficiency in RAG system assessment.

Findings

1. Achieves 85.7% agreement with human judgments.
2. Reduces evaluation complexity by 42.9%.
3. Outperforms existing metrics like RAGAS.

Abstract

As Retrieval-Augmented Generation (RAG) systems evolve toward more sophisticated architectures, ensuring their trustworthiness through explainable and robust evaluation becomes critical. Existing scalar metrics suffer from limited interpretability, inadequate uncertainty quantification, and computational inefficiency in multi-system comparisons, hindering responsible deployment of RAG technologies. We introduce DICE (Discrete Interpretable Comparative Evaluation), a two-stage, evidence-coupled framework that advances explainability and robustness in RAG evaluation. DICE combines deep analytical reasoning with probabilistic $\{A, B, \mathrm{Tie}\}$ scoring to produce transparent, confidence-aware judgments that support accountable system improvement through interpretable reasoning traces, enabling systematic error diagnosis and actionable insights. To address efficiency challenges at scale, DICE…
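To make the idea of a probabilistic three-outcome judgment concrete, the following is a minimal sketch, not the authors' implementation. The excerpt above does not say how DICE obtains its probabilities, so the sketch assumes a judge model exposes one raw score per label and normalizes them with a softmax. The names `outcome_probabilities`, `judge`, and the example scores are hypothetical.

```python
import math

# The three discrete outcomes named in the abstract.
LABELS = ("A", "B", "Tie")


def outcome_probabilities(scores):
    """Softmax over raw per-label scores (assumed input, not from the paper)."""
    peak = max(scores[label] for label in LABELS)  # subtract max for numerical stability
    exps = {label: math.exp(scores[label] - peak) for label in LABELS}
    total = sum(exps.values())
    return {label: exps[label] / total for label in LABELS}


def judge(scores):
    """Return the most probable outcome and its probability as a confidence."""
    probs = outcome_probabilities(scores)
    winner = max(LABELS, key=lambda label: probs[label])
    return winner, probs[winner]


# Hypothetical raw scores for one pairwise comparison of two RAG answers.
example_scores = {"A": 2.0, "B": 0.5, "Tie": 0.0}
verdict, confidence = judge(example_scores)
print(verdict, round(confidence, 3))
```

Reporting the probability alongside the verdict is what makes a judgment confidence-aware in this sketch: a near-uniform distribution signals a comparison that deserves closer inspection, while a peaked one signals a clear preference.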

Taxonomy

Topics: Advanced Graph Neural Networks · Explainable Artificial Intelligence (XAI) · Machine Learning in Materials Science