TL;DR
This paper introduces LcRL, a reinforcement learning framework for multilingual retrieval-augmented generation that reduces knowledge bias and conflict across languages, improving performance in multilingual tasks.
Contribution
The paper proposes a novel language-coupled reinforcement learning approach with group sampling and anti-consistency regularization for multilingual retrieval-augmented generation.
Findings
LcRL achieves competitive performance across multilingual scenarios.
The framework effectively reduces knowledge bias and conflict.
It performs well with limited training data and large multilingual collections.
Abstract
Multilingual retrieval-augmented generation (MRAG) requires models to effectively acquire and integrate beneficial external knowledge from multilingual collections. However, most existing studies employ a unified process in which semantically equivalent queries in different languages are handled by a single-turn retrieval followed by optimization. Such a "one-size-fits-all" strategy is often suboptimal in multilingual settings, as the models encounter knowledge bias and conflict when interacting with the search engine. To alleviate these issues, we propose LcRL, a multilingual search-augmented reinforcement learning framework that integrates a language-coupled Group Relative Policy Optimization into the policy and reward models. We adopt the language-coupled group sampling in the rollout module to reduce knowledge bias, and regularize an auxiliary anti-consistency…
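To make the "language-coupled group sampling" idea more concrete, here is a minimal sketch of how a group-relative advantage could be computed when rollouts for the same query in several languages are pooled into one group. The pooling rule, the function names (`group_relative_advantages`, `language_coupled_advantages`), the use of the population standard deviation, and the example rewards are all assumptions made for illustration; the abstract does not specify these details, and this is not the authors' implementation.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Standardize rewards within one group, as in GRPO-style training.

    Each advantage is (reward - group mean) / (group std + eps).
    Using the population std is an assumption of this sketch.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def language_coupled_advantages(rewards_by_language, eps=1e-8):
    """Hypothetical coupling: pool rollouts from all languages into one group.

    `rewards_by_language` maps a language code to the rewards of the rollouts
    sampled for the same underlying query in that language. The baseline is
    shared across languages, so a rollout is scored against answers obtained
    in every language, not only its own.
    """
    # Flatten while remembering which language each reward came from.
    owners = [lang for lang, rs in rewards_by_language.items() for _ in rs]
    flat = [r for rs in rewards_by_language.values() for r in rs]

    advantages = group_relative_advantages(flat, eps=eps)

    # Regroup the standardized values by language, preserving order.
    result = {lang: [] for lang in rewards_by_language}
    for lang, adv in zip(owners, advantages):
        result[lang].append(adv)
    return result


if __name__ == "__main__":
    # Made-up rewards for one query asked in three languages.
    example = {"en": [1.0, 0.0], "de": [0.0, 0.0], "zh": [1.0, 1.0]}
    for lang, advs in language_coupled_advantages(example).items():
        print(lang, [round(a, 3) for a in advs])
```

With a per-language baseline, the two German rollouts above would both receive zero advantage because they tie with each other. Under the pooled baseline they receive a negative advantage, since rollouts in other languages did better on the same query. This is one plausible reading of how coupling could counteract language-specific knowledge bias.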
