Relic: Enhancing Reward Model Generalization for Low-Resource Indic Languages with Few-Shot Examples

Soumya Suvra Ghosal; Vaibhav Singh; Akash Ghosh; Soumyabrata Pal; Subhadip Baidya; Sriparna Saha; Dinesh Manocha

arXiv:2506.16502 · cs.CL · June 23, 2025

TL;DR

RELIC is a novel in-context learning framework that improves reward model accuracy for low-resource Indic languages by selecting effective examples from high-resource languages, addressing data scarcity issues.

Contribution

RELIC introduces a retriever-based in-context learning approach for reward modeling in low-resource languages, enhancing performance without extensive data collection.

Findings

01

RELIC significantly improves reward model accuracy for low-resource languages.

02

On Bodo, RELIC improves reward model accuracy over zero-shot prompting by 12.81%.

03

RELIC surpasses existing methods on multiple datasets.
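
The findings above are stated in terms of reward model accuracy, but this page does not say how that accuracy is computed. A common convention for preference data, assumed here, is the fraction of pairs in which the reward model scores the preferred response strictly higher than the less-preferred one. The function name and the input layout are hypothetical, chosen only for illustration.

```python
def preference_accuracy(scored_pairs):
    """Fraction of pairs where the preferred response gets the higher reward.

    scored_pairs: list of (reward_for_preferred, reward_for_less_preferred).
    Ties count as errors. This is an assumed convention, not taken from the paper.
    """
    if not scored_pairs:
        raise ValueError("need at least one scored pair")
    correct = sum(1 for preferred, other in scored_pairs if preferred > other)
    return correct / len(scored_pairs)


# Hypothetical reward scores for four preference pairs.
example_pairs = [(0.9, 0.1), (0.2, 0.5), (1.0, 0.3), (0.4, 0.4)]
accuracy = preference_accuracy(example_pairs)
```

Under this convention, two of the four hypothetical pairs are ranked correctly, so `accuracy` is 0.5.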

Abstract

Reward models are essential for aligning large language models (LLMs) with human preferences. However, most open-source multilingual reward models are trained primarily on preference datasets in high-resource languages, resulting in unreliable reward signals for low-resource Indic languages. Collecting large-scale, high-quality preference data for these languages is prohibitively expensive, making preference-based training approaches impractical. To address this challenge, we propose RELIC, a novel in-context learning framework for reward modeling in low-resource Indic languages. RELIC trains a retriever with a pairwise ranking objective to select in-context examples from auxiliary high-resource languages that most effectively highlight the distinction between preferred and less-preferred responses. Extensive experiments on three preference datasets: PKU-SafeRLHF, WebGPT, and…
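
The abstract mentions a retriever trained with a pairwise ranking objective but does not give the form of the loss. As a rough sketch only, the following uses a margin-based hinge loss over cosine similarities, which is one common pairwise ranking formulation and not necessarily the one RELIC uses. The embeddings, the margin value, and every function name here are assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pairwise_ranking_loss(query, better, worse, margin=0.1):
    """Hinge loss on a (better, worse) pair of candidate in-context examples.

    The loss is zero once the retriever scores `better` at least `margin`
    above `worse` for this query. The margin of 0.1 is an arbitrary choice.
    """
    gap = cosine(query, better) - cosine(query, worse)
    return max(0.0, margin - gap)


def select_top_k(query, candidates, k):
    """Return the names of the k candidates most similar to the query.

    candidates: list of (name, embedding) tuples.
    """
    ranked = sorted(candidates, key=lambda c: cosine(query, c[1]), reverse=True)
    return [name for name, _ in ranked[:k]]


# Toy 2-D embeddings standing in for a low-resource query and
# high-resource candidate examples (all values are made up).
query = [1.0, 0.0]
pool = [("a", [0.0, 1.0]), ("b", [1.0, 0.0]), ("c", [1.0, 1.0])]
chosen = select_top_k(query, pool, k=2)
```

At inference time, a retriever of this kind would rank the high-resource pool for each low-resource query and pass the top-ranked examples to the reward model as in-context demonstrations. How RELIC decides which candidate counts as the better one during training is not stated on this page.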

Taxonomy

Topics: Recommender Systems and Techniques · Topic Modeling · Machine Learning and Data Classification