No Shortcuts to Culture: Indonesian Multi-hop Question Answering for Complex Cultural Understanding

Vynska Amalia Permadi; Xingwei Tan; Nafise Sadat Moosavi; Nikos Aletras

arXiv:2602.03709·cs.CL·February 4, 2026

TL;DR

This paper introduces ID-MoCQA, a large-scale multi-hop question answering dataset focused on Indonesian cultural knowledge, designed to evaluate and improve the cultural reasoning capabilities of large language models.

Contribution

It presents the first multi-hop cultural QA dataset grounded in Indonesian traditions, with a novel framework for transforming single-hop questions into complex reasoning chains.
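As a rough illustration of what turning a single-hop question into a multi-hop one could look like, the Python sketch below replaces a directly named entity with an indirect clue that must be resolved first. It is not the authors' framework: the names `Hop`, `MultiHopQuestion`, and `to_multi_hop` are hypothetical, the example question is made up for illustration and is not taken from ID-MoCQA, and only the three clue types named in the abstract (commonsense, temporal, geographical) are modeled.

```python
from dataclasses import dataclass
from typing import List

# Assumption: only the clue types explicitly named in the abstract.
# The paper describes six; the other three are not listed in this excerpt.
NAMED_CLUE_TYPES = {"commonsense", "temporal", "geographical"}


@dataclass
class Hop:
    """One intermediate reasoning step (hypothetical structure)."""
    clue_type: str    # one of NAMED_CLUE_TYPES
    clue: str         # indirect description shown to the model
    resolves_to: str  # entity the clue stands for


@dataclass
class MultiHopQuestion:
    question: str
    answer: str
    hops: List[Hop]


def to_multi_hop(question: str, answer: str, entity: str, hop: Hop) -> MultiHopQuestion:
    """Hide `entity` behind `hop.clue`, so answering needs an extra step."""
    if hop.clue_type not in NAMED_CLUE_TYPES:
        raise ValueError(f"unsupported clue type: {hop.clue_type}")
    if entity not in question:
        raise ValueError("entity must appear in the single-hop question")
    if hop.resolves_to != entity:
        raise ValueError("the clue must resolve to the hidden entity")
    rewritten = question.replace(entity, hop.clue)
    return MultiHopQuestion(question=rewritten, answer=answer, hops=[hop])


# Illustrative single-hop question (invented for this sketch).
single_hop = "Which Indonesian city is gudeg traditionally associated with?"
hop = Hop(
    clue_type="commonsense",
    clue="the sweet stew made from young jackfruit",
    resolves_to="gudeg",
)
multi_hop = to_multi_hop(single_hop, "Yogyakarta", "gudeg", hop)
print(multi_hop.question)
```

The answer stays the same, but the rewritten question no longer names the dish, so a model cannot match on the surface cue alone. A real pipeline would also need the validation stages the abstract mentions (expert review and LLM-as-a-judge filtering), which this sketch omits.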

Findings

01. State-of-the-art models show significant gaps in cultural reasoning.

02. The dataset reveals challenges in nuanced cultural inference.

03. High-quality questions are ensured through expert review and LLM filtering.

Abstract

Understanding culture requires reasoning across context, tradition, and implicit social knowledge, far beyond recalling isolated facts. Yet most culturally focused question answering (QA) benchmarks rely on single-hop questions, which may allow models to exploit shallow cues rather than demonstrate genuine cultural reasoning. In this work, we introduce ID-MoCQA, the first large-scale multi-hop QA dataset for assessing the cultural understanding of large language models (LLMs), grounded in Indonesian traditions and available in both English and Indonesian. We present a new framework that systematically transforms single-hop cultural questions into multi-hop reasoning chains spanning six clue types (e.g., commonsense, temporal, geographical). Our multi-stage validation pipeline, combining expert review and LLM-as-a-judge filtering, ensures high-quality question-answer pairs. Our…

Code & Models

Datasets

vynsk/ID-MoCQA
dataset · 15 downloads

Taxonomy

Topics: Topic Modeling · Multimodal Machine Learning Applications · Expert finding and Q&A systems