Think-J: Learning to Think for Generative LLM-as-a-Judge
Hui Huang, Yancheng He, Hongli Zhou, Rui Zhang, Wei Liu, Weixun Wang, Jiaheng Liu, Wenbo Su

TL;DR
This paper introduces Think-J, a method that improves LLMs as judges by teaching them to think, first with a small set of curated data and then with reinforcement learning. It significantly improves their ability to evaluate responses without extra human annotations.
Contribution
We propose a novel approach that enables LLMs to learn judgment thinking via RL, improving their evaluation performance without additional human-labeled data.
Findings
Significant improvement in LLM-Judge evaluation accuracy
Outperforms existing generative and classifier-based judges
Does not require extra human annotations
Abstract
LLM-as-a-Judge refers to the automatic modeling of preferences over responses generated by Large Language Models (LLMs), which is important for both LLM evaluation and reward modeling. Although generative LLMs have made substantial progress on various tasks, their performance as judges still falls short of expectations. In this work, we propose Think-J, which improves generative LLM-as-a-Judge by learning how to think. We first use a small amount of curated data to give the model initial judgment thinking capabilities. We then optimize the judgment thinking traces with reinforcement learning (RL). We propose two methods for judgment thinking optimization, based on offline and online RL, respectively. The offline method requires training a critic model to construct positive and negative examples for learning. The online method defines…
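The idea of building positive and negative examples from judgment thinking traces can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract says the offline method trains a critic model for this step, whereas the stand-in below labels a trace only by whether its final verdict matches a known preference label. The names `JudgmentTrace` and `build_preference_pairs`, and the "A"/"B" verdict format, are hypothetical.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class JudgmentTrace:
    """One sampled judgment (hypothetical structure, not from the paper)."""
    thinking: str  # free-form reasoning produced by the judge
    verdict: str   # "A" or "B": which response the judge prefers


def build_preference_pairs(traces, gold_label):
    """Pair each trace with a correct verdict against each with a wrong one.

    Assumption: correctness of the final verdict stands in for the
    critic model that the abstract describes.
    """
    positives = [t for t in traces if t.verdict == gold_label]
    negatives = [t for t in traces if t.verdict != gold_label]
    # Each (positive, negative) tuple is one training pair for
    # preference-based offline optimization.
    return list(product(positives, negatives))


traces = [
    JudgmentTrace("Response A cites the correct formula.", "A"),
    JudgmentTrace("Response B is longer, so it seems more thorough.", "B"),
    JudgmentTrace("Response A answers the question directly.", "A"),
]
pairs = build_preference_pairs(traces, gold_label="A")
```

If every sampled trace reaches the same verdict, the function returns an empty list, so such prompts contribute no pairs under this simplified scheme.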
Taxonomy
Topics: Artificial Intelligence in Law · Legal Education and Practice Innovations · Comparative and International Law Studies
