A Comparative Study of Language Models for Khmer Retrieval-Augmented Question Answering

Sereiwathna Ros; Phannet Pov; Ratanaktepi Chhor; Kimleang Ly; Wan-Sup Cho; and Saksonita Khoeurn

arXiv:2605.22099·cs.CL·May 22, 2026

TL;DR

This study evaluates retrieval-augmented question answering for Khmer by comparing three embedding models and five generators. Retriever choice strongly affects performance, and no single generator leads on every metric.

Contribution

The paper provides the first comprehensive benchmark of RAG components for Khmer, showing how much retriever selection matters and comparing generator performance.

Findings

01

BGE-M3 embedding model outperforms others in Khmer document retrieval.

02

No single generator model dominates across all evaluation metrics.

03

Retriever choice is a key bottleneck in Khmer RAG systems.

Abstract

Retrieval-Augmented Generation (RAG) has emerged as a promising paradigm for grounding large language model (LLM) outputs in retrieved evidence, thereby reducing hallucination and improving factual accuracy. Its efficacy, however, remains largely unexamined for low-resource, non-Latin-script languages such as Khmer. In this paper, we present a RAG-based question answering system for Khmer-language telecom-domain documents. We conduct a two-phase comparative evaluation. First, we benchmark three embedding models for dense retrieval over Khmer documents: BGE-M3 (567M), Jina-Embeddings-v3 (570M), and Qwen3-Embedding (597M). BGE-M3 consistently performs best, achieving a Hit Rate@3 of 0.285, File Hit Rate@3 of 0.700, MRR@3 of 0.221, and Precision@3 of 0.112, substantially outperforming the other retrievers. Second, using BGE-M3 as the selected retriever, we evaluate five generator…
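
The four retrieval metrics named in the abstract can be illustrated with a minimal Python sketch. The abstract does not spell out its definitions, so the code assumes the common ones: Hit Rate@k is the share of queries with at least one relevant chunk in the top k, MRR@k averages the reciprocal rank of the first relevant chunk (0 if none appears), and Precision@k averages the fraction of top-k chunks that are relevant. File Hit Rate@k is assumed here to count a query as a hit when any top-k chunk comes from the same source file as a relevant chunk. The `file#index` chunk ID scheme, the function names, and the toy data are all hypothetical and are not taken from the paper.

```python
from typing import List, Set


def file_of(chunk_id: str) -> str:
    """Return the source file of a chunk ID (assumed 'file#index' format)."""
    return chunk_id.split("#", 1)[0]


def hit_rate_at_k(ranked: List[List[str]], relevant: List[Set[str]], k: int = 3) -> float:
    """Share of queries with at least one relevant chunk in the top k."""
    hits = sum(any(c in rel for c in r[:k]) for r, rel in zip(ranked, relevant))
    return hits / len(ranked)


def file_hit_rate_at_k(ranked: List[List[str]], relevant: List[Set[str]], k: int = 3) -> float:
    """Share of queries where a top-k chunk comes from a relevant chunk's file."""
    hits = 0
    for r, rel in zip(ranked, relevant):
        gold_files = {file_of(c) for c in rel}
        hits += any(file_of(c) in gold_files for c in r[:k])
    return hits / len(ranked)


def mrr_at_k(ranked: List[List[str]], relevant: List[Set[str]], k: int = 3) -> float:
    """Mean reciprocal rank of the first relevant chunk within the top k."""
    total = 0.0
    for r, rel in zip(ranked, relevant):
        for rank, c in enumerate(r[:k], start=1):
            if c in rel:
                total += 1.0 / rank
                break  # only the first relevant chunk counts
    return total / len(ranked)


def precision_at_k(ranked: List[List[str]], relevant: List[Set[str]], k: int = 3) -> float:
    """Mean fraction of the top-k chunks that are relevant."""
    total = sum(sum(c in rel for c in r[:k]) / k for r, rel in zip(ranked, relevant))
    return total / len(ranked)


# Toy data: four queries, each with a top-3 ranking and a set of relevant chunks.
ranked = [
    ["d1#0", "d2#1", "d3#0"],
    ["d2#0", "d4#2", "d4#3"],
    ["d5#1", "d6#0", "d7#0"],
    ["d8#0", "d9#0", "d1#1"],
]
relevant = [{"d1#0"}, {"d4#2", "d4#3"}, {"d5#0"}, {"d2#5"}]

scores = {
    "hit_rate@3": hit_rate_at_k(ranked, relevant),
    "file_hit_rate@3": file_hit_rate_at_k(ranked, relevant),
    "mrr@3": mrr_at_k(ranked, relevant),
    "precision@3": precision_at_k(ranked, relevant),
}
print(scores)
```

Under these assumed definitions, the file-level hit rate can never be lower than the chunk-level one, because retrieving the exact chunk also retrieves its file. That is consistent with the gap the abstract reports for BGE-M3 (0.700 versus 0.285), which would mean the retriever often finds the right document but not the right passage within it.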
