LLMs-as-Judges: A Comprehensive Survey on LLM-based Evaluation Methods
Haitao Li, Qian Dong, Junjie Chen, Huixue Su, Yujia Zhou, Qingyao Ai, Ziyi Ye, Yiqun Liu

TL;DR
This survey comprehensively reviews the emerging paradigm of using Large Language Models as evaluators, covering their functionality, methodologies, applications, evaluation techniques, limitations, and future directions.
Contribution
It systematically defines LLMs-as-Judges, analyzes their methodologies, applications, and limitations, and discusses future research directions in this rapidly growing field.
Findings
LLMs demonstrate strong effectiveness and generalization in evaluation tasks.
Various methodologies exist for constructing LLM-based evaluation systems.
Identified limitations include biases and interpretability challenges.
Abstract
The rapid advancement of Large Language Models (LLMs) has driven their expanding application across various fields. One of the most promising applications is their role as evaluators based on natural language responses, referred to as "LLMs-as-judges". This framework has attracted growing attention from both academia and industry due to its excellent effectiveness, ability to generalize across tasks, and interpretability in the form of natural language. This paper presents a comprehensive survey of the LLMs-as-judges paradigm from five key perspectives: Functionality, Methodology, Applications, Meta-evaluation, and Limitations. We begin by providing a systematic definition of LLMs-as-judges and introducing their functionality (Why use LLM judges?). We then address the methodology for constructing an evaluation system with LLMs (How to use LLM judges?). Additionally, we investigate the…
Taxonomy
Topics: Legal Education and Practice Innovations · Artificial Intelligence in Law · Law, AI, and Intellectual Property
Methods: Softmax · Attention Is All You Need
