EasyJudge: an Easy-to-use Tool for Comprehensive Response Evaluation of LLMs
Yijie Li, Yuan Sun

TL;DR
EasyJudge is a lightweight, user-friendly evaluation tool for LLM responses that offers high accuracy, efficiency, and visualization features, addressing limitations of existing evaluation methods.
Contribution
The paper introduces EasyJudge, an open-source, optimized evaluation tool with visualization, designed for efficient and accurate assessment of LLM responses.
Findings
Achieves strong consistency with human evaluations.
Runs efficiently on consumer-grade hardware.
Provides an intuitive visualization interface.
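Consistency between an LLM judge and human evaluators is often quantified as label agreement. The following is a minimal sketch of two such measures, raw agreement rate and Cohen's kappa, written in plain Python. The metric choice, the function names, and the sample labels are illustrative assumptions; they are not taken from the paper and are not EasyJudge's API or results.

```python
from collections import Counter


def agreement_rate(judge, human):
    """Fraction of items on which the judge and the human give the same label."""
    if len(judge) != len(human) or not judge:
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(j == h for j, h in zip(judge, human)) / len(judge)


def cohen_kappa(judge, human):
    """Agreement corrected for chance, based on each rater's label frequencies."""
    n = len(judge)
    p_observed = agreement_rate(judge, human)
    judge_counts, human_counts = Counter(judge), Counter(human)
    labels = set(judge_counts) | set(human_counts)
    # Chance agreement: probability both raters pick the same label independently.
    p_expected = sum(judge_counts[k] * human_counts[k] for k in labels) / (n * n)
    if p_expected == 1.0:
        # Both raters used a single identical label; treat as perfect agreement.
        return 1.0
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical pairwise verdicts ("A" wins, "B" wins, or "tie"), for illustration only.
judge_labels = ["A", "B", "tie", "A", "B", "A"]
human_labels = ["A", "B", "A", "A", "B", "B"]

rate = agreement_rate(judge_labels, human_labels)
kappa = cohen_kappa(judge_labels, human_labels)
```

Raw agreement is easy to read but can look high when one label dominates; kappa discounts the agreement expected by chance, which is why the two are often reported together.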
Abstract
Recently, there has been a growing trend of employing large language models (LLMs) to judge the quality of other LLMs. Many studies have adopted closed-source models, mainly using GPT-4 as the evaluator. However, because GPT-4 is closed-source, employing it as an evaluator raises concerns about transparency, controllability, and cost-effectiveness. Some researchers have therefore turned to fine-tuned open-source LLMs as evaluators. However, existing open-source evaluation LLMs generally lack a user-friendly visualization tool and have not been optimized for accelerated model inference, which is inconvenient for researchers with limited resources and for those working across different fields. This paper presents EasyJudge, a model developed to evaluate large language model responses. It is lightweight, precise, efficient, and user-friendly,…
Taxonomy
Topics: Biomedical Text Mining and Ontologies
Methods: Attention Is All You Need · Dropout · Layer Normalization · Adam · Dense Connections · Residual Connection · Position-Wise Feed-Forward Layer · Linear Layer · Byte Pair Encoding · Absolute Position Encodings
