Few-Shot Multilingual Open-Domain QA from 5 Examples

Fan Jiang; Tom Drummond; Trevor Cohn

arXiv:2502.19722·cs.CL·February 28, 2025

TL;DR

This paper presents a few-shot approach to multilingual open-domain question answering: large language models generate synthetic training data, so the system needs only a handful of annotated examples.

Contribution

The paper introduces FsModQA, a novel few-shot learning method that synthesizes multilingual training data from LLMs and extends to zero-shot language adaptation using cross-lingual prompting.

Findings

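01 FsModQA outperforms existing few-shot and supervised baselines in multilingual open-domain QA and in cross-lingual and monolingual retrieval.

02 The method adapts zero-shot to new languages through cross-lingual prompting.

03 Synthetic data generation reduces reliance on costly language-specific annotations.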

Abstract

Recent approaches to multilingual open-domain question answering (MLODQA) have achieved promising results given abundant language-specific training data. However, the considerable annotation cost limits the application of these methods for underrepresented languages. We introduce a few-shot learning approach to synthesise large-scale multilingual data from large language models (LLMs). Our method begins with large-scale self-supervised pre-training using WikiData, followed by training on high-quality synthetic multilingual data generated by prompting LLMs with few-shot supervision. The final model, FsModQA, significantly outperforms existing few-shot and supervised baselines in MLODQA and cross-lingual and monolingual retrieval. We further show our method can be extended for effective zero-shot adaptation to new languages through a cross-lingual prompting strategy…
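The abstract mentions prompting LLMs with few-shot supervision to generate synthetic data. As a rough illustration only, the sketch below assembles such a prompt from up to five demonstrations. The `Example` schema, the `build_few_shot_prompt` helper, and the prompt wording are all assumptions made for this example; they are not taken from the paper or its repository.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Example:
    """One annotated demonstration (hypothetical schema, not the paper's)."""
    passage: str
    question: str
    answer: str


def build_few_shot_prompt(
    examples: Sequence[Example],
    passage: str,
    language: str,
    max_shots: int = 5,
) -> str:
    """Assemble a prompt asking an LLM to write a QA pair for `passage`."""
    # Instruction wording is an assumption chosen for illustration.
    header = (
        f"Write a question in {language} that the passage answers, "
        "then give the answer."
    )
    blocks = [header]
    # Use at most `max_shots` demonstrations, in the order given.
    for ex in examples[:max_shots]:
        blocks.append(
            f"Passage: {ex.passage}\nQuestion: {ex.question}\nAnswer: {ex.answer}"
        )
    # The final block is left open so the model completes the question and answer.
    blocks.append(f"Passage: {passage}\nQuestion:")
    return "\n\n".join(blocks)
```

Calling `build_few_shot_prompt(shots, new_passage, "Finnish")` returns a single string that can be sent to any text-completion model; parsing and filtering the generated question and answer is left out of this sketch.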

Code & Models

Repositories

Fantabulous-J/FSMODQA
PyTorch · Official

Videos

Few-shot Multilingual Open-domain QA from 5 Examples · Underline

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Expert finding and Q&A systems