InspectorRAGet: An Introspection Platform for RAG Evaluation

Kshitij Fadnis; Siva Sankalp Patel; Odellia Boni; Yannis Katsis; Sara Rosenthal; Benjamin Sznajder; Marina Danilevsky

arXiv:2404.17347·cs.SE·May 5, 2025

TL;DR

InspectorRAGet is a comprehensive, publicly available platform designed for detailed analysis and evaluation of RAG system outputs, integrating human and algorithmic metrics for improved assessment.

Contribution

The paper introduces an introspection platform that enables multi-level analysis of RAG system output, addressing the scarcity of tools for comprehensive evaluation.

Findings

01 Supports aggregate and instance-level analysis
02 Incorporates human and algorithmic metrics
03 Available publicly for community use

Abstract

Large Language Models (LLMs) have become a popular approach for implementing Retrieval Augmented Generation (RAG) systems, and a significant amount of effort has been spent on building good models and metrics. In spite of increased recognition of the need for rigorous evaluation of RAG systems, few tools exist that go beyond the creation of model output and automatic calculation. We present InspectorRAGet, an introspection platform for performing a comprehensive analysis of the quality of RAG system output. InspectorRAGet allows the user to analyze aggregate and instance-level performance of RAG systems, using both human and algorithmic metrics as well as annotator quality. InspectorRAGet is suitable for multiple use cases and is available publicly to the community. A live instance of the platform is available at https://ibm.biz/InspectorRAGet.
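
To make the distinction between aggregate and instance-level analysis concrete, here is a minimal Python sketch. It is not InspectorRAGet's code, API, or input format: the record layout, the model and annotator names, the 1-4 rating scale, and the functions `aggregate_by_model`, `instance_spread`, and `flag_disagreements` are all hypothetical, chosen only for illustration. It assumes each record holds one annotator's score for one model response.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (instance_id, model, annotator, score on a 1-4 scale).
# This layout is an assumption for illustration, not the platform's format.
ratings = [
    ("q1", "model_a", "ann1", 4), ("q1", "model_a", "ann2", 4),
    ("q1", "model_b", "ann1", 2), ("q1", "model_b", "ann2", 3),
    ("q2", "model_a", "ann1", 1), ("q2", "model_a", "ann2", 4),
    ("q2", "model_b", "ann1", 3), ("q2", "model_b", "ann2", 3),
]


def aggregate_by_model(rows):
    """Aggregate view: mean score per model over all instances and annotators."""
    scores = defaultdict(list)
    for _, model, _, score in rows:
        scores[model].append(score)
    return {model: mean(vals) for model, vals in scores.items()}


def instance_spread(rows):
    """Instance view: max minus min annotator score per (instance, model)."""
    by_key = defaultdict(list)
    for instance, model, _, score in rows:
        by_key[(instance, model)].append(score)
    return {key: max(vals) - min(vals) for key, vals in by_key.items()}


def flag_disagreements(rows, threshold=2):
    """Return (instance, model) pairs whose annotator scores differ by >= threshold."""
    spread = instance_spread(rows)
    return sorted(key for key, gap in spread.items() if gap >= threshold)


print(aggregate_by_model(ratings))
print(flag_disagreements(ratings))
```

In this toy data the aggregate view ranks the two models, while the instance view surfaces a single response on which the two annotators disagree sharply, which is the kind of case that an aggregate score alone would hide.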

Code & Models

Repositories

ibm/inspectorraget (Official)

Videos

InspectorRAGet: An Introspection Platform for RAG Evaluation