Test-Time Scaling with Repeated Sampling Improves Multilingual Text Generation

Ashim Gupta; Vivek Srikumar

arXiv:2505.21941·cs.CL·May 29, 2025

Open Access

TL;DR

This paper shows that test-time scaling with repeated sampling improves multilingual text generation quality on two benchmarks (the Aya Evaluation Suite and m-ArenaHard), and that reasoning tasks benefit only when a reward-based verifier selects among the samples.

Contribution

It evaluates repeated sampling at inference time for multilingual generation, using perplexity- and reward-based verifiers, and shows that the right verifier depends on the task type.

Findings

01

Repeated sampling consistently improves multilingual generation quality, with gains exceeding 35% in some cases.

02

Only reward-based verifiers improve performance on reasoning tasks (e.g., math, code); perplexity-based scoring does not.

03

Perplexity scoring is effective for open-ended prompts.

Abstract

Inference-time scaling via repeated sampling has shown promise in reasoning tasks, but its effectiveness in multilingual generation remains underexplored. We evaluate this approach using perplexity- and reward-based verifiers on two multilingual benchmarks: the Aya Evaluation Suite and m-ArenaHard. Our results show consistent quality improvements, with gains exceeding 35% in some cases. While perplexity-based scoring is effective for open-ended prompts, only reward-based verifiers improve performance on tasks requiring reasoning (e.g., math, code). Our results demonstrate the broader utility of repeated sampling for multilingual text generation and underscore the importance of selecting the right verifier for the task.
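To make the selection step concrete, here is a minimal Python sketch of best-of-N selection with two interchangeable verifiers. It is an illustration under stated assumptions, not the authors' implementation: the function names `perplexity` and `best_of_n`, the candidate labels, the token log-probabilities, and the reward values are all hypothetical toy data, and a real setup would obtain log-probabilities from the generating model and rewards from a trained reward model.

```python
import math
from typing import Callable, Sequence


def perplexity(token_logprobs: Sequence[float]) -> float:
    """Perplexity of one sequence from its per-token natural-log probabilities."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def best_of_n(candidates: Sequence[str], score: Callable[[str], float]) -> str:
    """Return the candidate with the highest verifier score."""
    if not candidates:
        raise ValueError("need at least one candidate")
    return max(candidates, key=score)


# Hypothetical toy data: three sampled responses to the same prompt.
toy_logprobs = {
    "A": [-0.1, -0.2],
    "B": [-1.0, -2.0],
    "C": [-0.5, -0.5],
}
# Hypothetical reward-model scores for the same three responses.
toy_rewards = {"A": 0.2, "B": 0.9, "C": 0.5}

# Perplexity-based verifier: lower perplexity is better, so negate it.
pick_by_perplexity = best_of_n(
    list(toy_logprobs), lambda c: -perplexity(toy_logprobs[c])
)

# Reward-based verifier: higher reward is better.
pick_by_reward = best_of_n(list(toy_rewards), lambda c: toy_rewards[c])

print(pick_by_perplexity, pick_by_reward)
```

In this toy data the two verifiers pick different candidates, which is the situation where verifier choice matters: the most fluent sample under the model is not necessarily the one a reward model rates highest.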

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification