Generative Query Expansion with Multilingual LLMs for Cross-Lingual Information Retrieval
Olivia Macmillan-Scott, Roksana Goworek, Eda B. Özyiğit

TL;DR
This paper evaluates how multilingual large language models can be used for cross-lingual query expansion in information retrieval, highlighting the impact of query length, prompt complexity, and linguistic disparities on performance.
Contribution
It systematically assesses recent mLLMs and prompts for cross-lingual query expansion, revealing key factors influencing retrieval effectiveness and the limitations due to linguistic and script differences.
Findings
Query length largely determines which prompting technique is effective.
More complex prompts do not always improve results.
Languages with weaker baselines benefit most from expansion.
Abstract
Query expansion is the reformulation of a user query by adding semantically related information, and is an essential component of monolingual and cross-lingual information retrieval used to ensure that relevant documents are not missed. Recently, multilingual large language models (mLLMs) have shifted query expansion from semantic augmentation with synonyms and related words to pseudo-document generation. Pseudo-documents both introduce additional relevant terms and bridge the gap between short queries and long documents, which is particularly beneficial in dense retrieval. This study evaluates recent mLLMs and fine-tuned variants across several generative expansion strategies to identify factors that drive cross-lingual retrieval performance. Results show that query length largely determines which prompting technique is effective, and that more elaborate prompts often do not yield…
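The pseudo-document idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the prompt wording, the function names `build_prompt` and `expand_query`, the `query_repeats` parameter, and the stand-in generator `fake_generate` are all hypothetical, and a real setup would replace the stub with a call to a multilingual LLM and pass the expanded string to a retriever.

```python
from typing import Callable


def build_prompt(query: str, target_language: str) -> str:
    """Build a simple pseudo-document prompt (wording is an assumption, not from the paper)."""
    return (
        f"Write a short passage in {target_language} that answers the query below.\n"
        f"Query: {query}\n"
        "Passage:"
    )


def expand_query(
    query: str,
    generate: Callable[[str], str],
    target_language: str = "English",
    query_repeats: int = 1,
) -> str:
    """Append a generated pseudo-document to the original query.

    `generate` is any text-in, text-out callable (hypothetical interface).
    `query_repeats` repeats the original query so that a short query is not
    drowned out by a long pseudo-document in term-based retrieval.
    """
    pseudo_doc = generate(build_prompt(query, target_language)).strip()
    if not pseudo_doc:
        # Fall back to the unexpanded query if the model returns nothing.
        return query
    return " ".join([query] * query_repeats + [pseudo_doc])


def fake_generate(prompt: str) -> str:
    """Stub standing in for an mLLM call, so the sketch runs offline."""
    return " Solar panels convert sunlight into electricity. "


# A German query expanded with an English pseudo-document (cross-lingual case).
expanded = expand_query("Wie funktionieren Solarzellen?", fake_generate)
print(expanded)
```

The expanded string keeps the original query in front and adds the generated passage in the document language, which is one simple way to introduce additional relevant terms and narrow the length gap between queries and documents.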
Taxonomy
Topics: Information Retrieval and Search Behavior · Topic Modeling · Semantic Web and Ontologies
