CCRS: A Zero-Shot LLM-as-a-Judge Framework for Comprehensive RAG Evaluation

Aashiq Muhamed

arXiv:2506.20128·cs.CL·June 26, 2025

TL;DR

This paper introduces CCRS, a zero-shot, LLM-based evaluation framework that comprehensively assesses RAG system outputs across multiple quality dimensions with high efficiency.

Contribution

The paper presents CCRS, a novel suite of five metrics using a single pretrained LLM for zero-shot, end-to-end RAG output evaluation, improving efficiency and discriminative power.

Findings

1. CCRS effectively distinguishes performance differences among RAG systems.

2. CCRS outperforms or matches complex frameworks like RAGChecker in key evaluation aspects.

3. CCRS is significantly more computationally efficient than existing multi-stage evaluation methods.

Abstract

RAG systems enhance LLMs by incorporating external knowledge, which is crucial for domains that demand factual accuracy and up-to-date information. However, evaluating the multifaceted quality of RAG outputs, spanning aspects such as contextual coherence, query relevance, factual correctness, and informational completeness, poses significant challenges. Existing evaluation methods often rely on simple lexical overlap metrics, which are inadequate for capturing these nuances, or involve complex multi-stage pipelines with intermediate steps like claim extraction or require finetuning specialized judge models, hindering practical efficiency. To address these limitations, we propose CCRS (Contextual Coherence and Relevance Score), a novel suite of five metrics that utilizes a single, powerful, pretrained LLM as a zero-shot, end-to-end judge. CCRS evaluates: Contextual Coherence (CC),…
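The single-judge, zero-shot setup described in the abstract can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the prompt wording, the 0 to 100 scale, the criterion description, and the names `build_prompt`, `parse_score`, and `judge_score` are all hypothetical. The judge is passed in as a plain callable so that any LLM client could be plugged in.

```python
import re
from typing import Callable, Optional

# Hypothetical prompt template; the paper's actual prompts are not shown here.
PROMPT_TEMPLATE = (
    "You are an impartial judge.\n"
    "Criterion: {criterion}\n"
    "Query: {query}\n"
    "Retrieved context: {context}\n"
    "Response: {response}\n"
    "Rate the response on this criterion from 0 to 100. "
    "Reply with the number only."
)


def build_prompt(criterion: str, query: str, context: str, response: str) -> str:
    """Fill the zero-shot template: no examples, no fine-tuning, one call."""
    return PROMPT_TEMPLATE.format(
        criterion=criterion, query=query, context=context, response=response
    )


def parse_score(reply: str) -> Optional[float]:
    """Take the first number in the judge's reply and map it to [0, 1]."""
    match = re.search(r"\d+(?:\.\d+)?", reply)
    if match is None:
        return None  # the judge did not return a usable number
    value = float(match.group())
    # Clamp to the assumed 0-100 scale before normalizing.
    return min(max(value, 0.0), 100.0) / 100.0


def judge_score(
    llm: Callable[[str], str],
    criterion: str,
    query: str,
    context: str,
    response: str,
) -> Optional[float]:
    """Score one response on one criterion with a single judge call."""
    return parse_score(llm(build_prompt(criterion, query, context, response)))


if __name__ == "__main__":
    # Stand-in judge for demonstration; a real setup would call an LLM here.
    def fake_llm(prompt: str) -> str:
        return "85"

    score = judge_score(
        fake_llm,
        criterion="Contextual Coherence: the response is consistent with the context.",
        query="What does RAG add to an LLM?",
        context="RAG systems incorporate external knowledge at generation time.",
        response="It supplies external knowledge to the model.",
    )
    print(score)
```

In this sketch each metric would be one more criterion string passed through the same function, which is one way to read "a single LLM as an end-to-end judge": there is no claim-extraction stage and no separately trained judge model between the inputs and the score.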

Taxonomy

Topics: Radiation Dose and Imaging

Methods: Attention Dropout · Dropout · Byte Pair Encoding · Softmax · Dense Connections · Layer Normalization · Linear Warmup With Linear Decay · BERT · BART