Dr. SoW: Density Ratio of Strong-over-weak LLMs for Reducing the Cost of Human Annotation in Preference Tuning

Guangxuan Xu; Kai Xu; Shivchander Sudalairaj; Hao Wang; Akash Srivastava

arXiv:2411.02481 · cs.CL · February 4, 2025

TL;DR

This paper introduces Dr.SoW, a cost-effective method that uses the log-density ratio between a better-aligned and a less-aligned off-the-shelf LLM as a reward signal, replacing human annotation in preference tuning without additional fine-tuning.

Contribution

We propose Dr.SoW, a novel approach leveraging LLM density ratios for preference data annotation, reducing reliance on costly human labeling and enabling domain-specific reward customization.

Findings

01. The performance gap between the paired models correlates strongly with the quality of the reward signal (evaluated across 221 LLM pairs).

02. Dr.SoW achieves high scores on RewardBench, with competitive results on safety and reasoning metrics.

03. Llama-3-8B-Instruct, preference-tuned on Dr.SoW-annotated data, shows significant performance improvements.

Abstract

Preference tuning relies on high-quality human preference data, which is often expensive and time-consuming to gather. In this paper, we introduce Dr.SoW (Density Ratio of Strong over Weak), a cost-effective method that eliminates the reliance on human annotation by leveraging off-the-shelf LLMs for preference data annotation. Dr.SoW uses the log-density ratio between a better-aligned and a less-aligned LLM as a reward signal. We evaluate Dr.SoW across 221 different LLM pairs and empirically find a strong correlation between the performance gap of the paired models and the quality of the reward signal. This insight provides a practical guideline for selecting LLMs for data annotation. Additionally, we introduce an end-to-end pipeline that customizes reward functions based on user query domains. Without fine-tuning, it improves accuracy on domain-specific evaluations. With a pair of…
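The reward signal described in the abstract can be illustrated with a minimal sketch. It assumes that per-token log-probabilities of each response have already been obtained from the two models; the function names, the tie-breaking rule, and the numeric values below are hypothetical and are not taken from the paper. The reward for a response is the log-probability under the better-aligned (strong) model minus the log-probability under the less-aligned (weak) model, and the response with the higher reward is labeled as preferred.

```python
from typing import Sequence, Tuple


def density_ratio_reward(
    strong_token_logprobs: Sequence[float],
    weak_token_logprobs: Sequence[float],
) -> float:
    """Log-density ratio: log p_strong(y | x) - log p_weak(y | x).

    Each argument holds the per-token log-probabilities that one model
    assigns to the same response y given the same prompt x. Summing them
    yields the sequence log-probability under that model.
    """
    return sum(strong_token_logprobs) - sum(weak_token_logprobs)


def label_preference(reward_a: float, reward_b: float) -> Tuple[str, str]:
    """Return (chosen, rejected) for two candidate responses 'a' and 'b'.

    Ties go to 'a'; this tie-breaking rule is an assumption of the sketch.
    """
    return ("a", "b") if reward_a >= reward_b else ("b", "a")


# Hypothetical per-token log-probabilities for two candidate responses.
reward_a = density_ratio_reward(
    strong_token_logprobs=[-0.5, -0.25],  # strong model finds response a likely
    weak_token_logprobs=[-1.0, -0.75],
)
reward_b = density_ratio_reward(
    strong_token_logprobs=[-1.5, -1.0],   # strong model finds response b unlikely
    weak_token_logprobs=[-1.0, -0.5],
)

chosen, rejected = label_preference(reward_a, reward_b)
print(f"chosen={chosen} rejected={rejected}")
```

The resulting (chosen, rejected) pairs are the kind of annotation that would otherwise come from human labelers. In practice the log-probabilities would come from two real LLMs, and the abstract's finding suggests choosing a pair with a clear performance gap between them.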

Taxonomy

Topics: Data Management and Algorithms · Rough Sets and Fuzzy Logic · Data Mining Algorithms and Applications