CroSearch-R1: Better Leveraging Cross-lingual Knowledge for Retrieval-Augmented Generation
Rui Qi, Fengran Mo, Sijin Lu, Yufeng Chen, Jian-Yun Nie, Kaiyu Huang

TL;DR
CroSearch-R1 is a search-augmented reinforcement learning framework that improves retrieval-augmented generation (RAG) over multilingual collections by integrating cross-lingual knowledge through multi-turn retrieval.
Contribution
Introduces a search-augmented reinforcement learning approach, built on Group Relative Policy Optimization (GRPO), that combines multi-turn multilingual retrieval with a multilingual rollout mechanism to better leverage cross-lingual knowledge in RAG.
Findings
Improves RAG effectiveness when retrieving from multilingual collections.
Uses knowledge in other languages to supplement and correct facts retrieved in the original language.
Demonstrates superior performance in experiments.
Abstract
A multilingual collection may contain useful knowledge in other languages to supplement and correct the facts in the original language for Retrieval-Augmented Generation (RAG). However, the vanilla approach that simply concatenates multiple pieces of knowledge from different languages into the context may fail to improve effectiveness due to the potential disparities across languages. To better leverage multilingual knowledge, we propose CroSearch-R1, a search-augmented reinforcement learning framework to integrate multilingual knowledge into the Group Relative Policy Optimization (GRPO) process. In particular, the approach adopts a multi-turn retrieval strategy with cross-lingual knowledge integration to dynamically align the knowledge from other languages as supplementary evidence into a unified representation space. Furthermore, we introduce a multilingual rollout mechanism to…
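For readers unfamiliar with GRPO, the group-relative normalization it is named for can be sketched as follows. This is a minimal illustration of the general GRPO idea, not the authors' implementation: the function name, the reward values, the `eps` constant, and the use of the population standard deviation are all assumptions made for the example. Each sampled rollout for a question is scored, and its advantage is its reward standardized against the other rollouts in the same group.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Standardize each rollout's reward against its own group.

    `rewards` holds one scalar per rollout sampled for the same question.
    The name, `eps`, and the population std are illustrative assumptions.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    # eps keeps the division defined when every rollout gets the same reward.
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four rollouts of one question
# (for instance, 1.0 if the final answer matched the reference, else 0.0).
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
```

Rollouts that score above the group mean receive positive advantages and are reinforced; those below receive negative ones. When all rollouts in a group score the same, every advantage is zero and the group contributes no learning signal.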
