IndoPref: A Multi-Domain Pairwise Preference Dataset for Indonesian

Vanessa Rebecca Wiyono; David Anugraha; Ayu Purwarianti; Genta Indra Winata

arXiv:2507.22159·cs.CL·November 13, 2025

TL;DR

IndoPref is a fully human-authored, multi-domain Indonesian preference dataset for evaluating the naturalness and quality of LLM-generated Indonesian text, with attention to linguistic and cultural authenticity.

Contribution

This work introduces the first fully human-authored, multi-domain Indonesian preference dataset for evaluating LLMs, addressing the lack of culturally authentic Indonesian benchmarks.

Findings

01

522 prompts with 4,099 pairwise preferences collected

02

Strong inter-annotator agreement, measured by Krippendorff's alpha, indicating reliable annotations

03

Benchmark covers 10 diverse categories for detailed LLM evaluation
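To put the preference count in context, the sketch below enumerates the unordered model pairs available per prompt when five models are compared, as the abstract describes. The model names are hypothetical placeholders, and the assumption that every prompt could be paired exhaustively is ours; this page does not describe the pairing scheme, nor why the reported total of 4,099 falls below the resulting upper bound.

```python
from itertools import combinations

# Hypothetical placeholder names; the page does not list the five models.
models = ["model_1", "model_2", "model_3", "model_4", "model_5"]

# Every unordered pair of distinct models is one possible comparison.
pairs = list(combinations(models, 2))
pairs_per_prompt = len(pairs)

# Upper bound if all 522 prompts were compared over every pair (an assumption).
num_prompts = 522
upper_bound = num_prompts * pairs_per_prompt

print(pairs_per_prompt, upper_bound)
```

Under that assumption there are 10 pairs per prompt and at most 5,220 comparisons, so the 4,099 reported preferences would cover a subset of the possible pairs.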

Abstract

Over 200 million people speak Indonesian, yet the language remains significantly underrepresented in preference-based research for large language models (LLMs). Most existing multilingual datasets are derived from English translations, often resulting in content that lacks cultural and linguistic authenticity. To address this gap, we introduce IndoPref, the first fully human-authored and multi-domain Indonesian preference dataset designed to evaluate the naturalness and quality of LLM-generated text. The dataset contains 522 prompts and yields 4,099 human-annotated pairwise preferences from comparisons across five instruction-tuned LLMs. All annotations are natively written in Indonesian with strong inter-annotator agreement, measured by Krippendorff's alpha. Our benchmark spans 10 diverse categories, enabling practitioners to identify LLMs' fine-grained strengths and weaknesses.
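Since the abstract reports agreement with Krippendorff's alpha, a minimal sketch of that statistic for nominal labels may help. It assumes each comparison receives categorical labels such as "A", "B", or "tie" from several annotators; the label set, the function name, and the choice of the nominal variant are illustrative assumptions, not details taken from the paper.

```python
from collections import Counter
from itertools import permutations


def krippendorff_alpha_nominal(units):
    """Krippendorff's alpha for nominal labels.

    units: one list of labels per annotated item (missing ratings omitted).
    Returns NaN when alpha is undefined (no pairable data or a single label).
    """
    # Coincidence matrix: each ordered pair of ratings within an item
    # contributes 1 / (m - 1), where m is the number of ratings on that item.
    coincidences = Counter()
    for labels in units:
        m = len(labels)
        if m < 2:
            continue  # items with fewer than two ratings cannot be paired
        for a, b in permutations(labels, 2):
            coincidences[(a, b)] += 1.0 / (m - 1)

    # Marginal totals per label and the overall number of pairable ratings.
    totals = Counter()
    for (a, _), weight in coincidences.items():
        totals[a] += weight
    n = sum(totals.values())
    if n < 2:
        return float("nan")

    observed = sum(w for (a, b), w in coincidences.items() if a != b)
    expected = sum(
        totals[a] * totals[b] for a in totals for b in totals if a != b
    ) / (n - 1)
    if expected == 0:
        return float("nan")
    return 1.0 - observed / expected


# Hypothetical labels from two annotators on four comparisons.
example = [["A", "A"], ["B", "B"], ["A", "B"], ["tie", "tie"]]
alpha = krippendorff_alpha_nominal(example)
```

Values near 1 indicate agreement well above chance, and values near 0 indicate agreement no better than chance. If the annotations were treated as ordinal rather than nominal, a different distance function would be needed.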
