DogeRM: Equipping Reward Models with Domain Knowledge through Model Merging
Tzu-Han Lin, Chen-An Li, Hung-yi Lee, Yun-Nung Chen

TL;DR
DogeRM is a framework that integrates domain-specific knowledge into reward models via model merging, improving alignment efficiency and performance in reinforcement learning from human feedback.
Contribution
The paper introduces DogeRM, a novel method for incorporating domain knowledge into reward models through model merging, reducing data collection costs and enhancing model alignment.
Findings
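DogeRM improves reward model performance across multiple benchmarks.
Model merging effectively integrates domain knowledge into a general reward model.
Merging could reduce annotation costs in RLHF by lowering the need for expert-annotated, domain-specific preference data.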
Abstract
Reinforcement learning from human feedback (RLHF) is a popular strategy for aligning large language models (LLMs) with desired behaviors. Reward modeling is a crucial step in RLHF. However, collecting paired preference data for training reward models is often costly and time-consuming, especially for domain-specific preferences requiring expert annotation. To address this challenge, we propose the Domain knowledge merged Reward Model (DogeRM), a novel framework that integrates domain-specific knowledge into a general reward model by model merging. Experiments demonstrate that DogeRM enhances performance across different benchmarks, and a detailed analysis of the effects of model merging shows its great potential for facilitating model alignment.
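The abstract does not spell out the merging rule, so the following is only a minimal sketch of one common form of model merging: linear interpolation of weights between a general reward model and a domain-specific model. It assumes both were fine-tuned from the same base model, so that parameters with the same name have the same shape. The function name `merge_reward_model`, the weight `lam`, the parameter names, and the toy values are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List

# Toy stand-in for a model state dict: parameter name -> flat list of weights.
Params = Dict[str, List[float]]


def merge_reward_model(rm: Params, domain: Params, lam: float) -> Params:
    """Linearly interpolate the parameters shared by both models.

    Assumption (not from the source): parameters that exist only in the
    reward model, such as a scalar reward head, are copied unchanged, and
    parameters that exist only in the domain model are dropped.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must be in [0, 1]")
    merged: Params = {}
    for name, rm_w in rm.items():
        dom_w = domain.get(name)
        if dom_w is None:
            # Reward-model-only parameter: keep it as is.
            merged[name] = list(rm_w)
            continue
        if len(dom_w) != len(rm_w):
            raise ValueError(f"shape mismatch for {name}")
        # lam weights the reward model, (1 - lam) the domain model.
        merged[name] = [lam * a + (1.0 - lam) * b for a, b in zip(rm_w, dom_w)]
    return merged


# Hypothetical toy parameters, for illustration only.
rm = {"layer.weight": [1.0, 2.0], "reward_head.weight": [0.5]}
domain = {"layer.weight": [3.0, 4.0], "lm_head.weight": [9.0]}
merged = merge_reward_model(rm, domain, lam=0.5)
```

With `lam=1.0` the sketch returns the original reward model, and lowering `lam` shifts the shared weights toward the domain model. No new preference data is needed at merge time, which is the cost saving the abstract points to.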
Code & Models
- 🤗 miulab/llama2-7b-ultrafeedback-rm (model) · 16 dl · ♡ 1
- 🤗 miulab/llama2-7b-magicoder-evol-instruct (model) · 4 dl
- 🤗 miulab/llama2-7b-alpaca-sft-10k (model) · 5 dl
- 🤗 miulab/llama2-7b-oss-instruct (model) · 4 dl
- 🤗 MachoMaheen/devdock4bit (model)
- 🤗 RichardErkhov/miulab_-_llama2-7b-alpaca-sft-10k-8bits (model) · 1 dl
- 🤗 sicer/arc-agi-legacy (model)
- 🤗 JilinHu/llemma_7b_3epoch_r32_e5_RQ1 (model) · 1 dl
- 🤗 Xin-Rui/LLAMA-Fac-NEW-A800 (model) · ♡ 1
- 🤗 Linksome/lmf (model)
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques
