MAB-DQA: Addressing Query Aspect Importance in Document Question Answering with Multi-Armed Bandits

Yixin Xiang; Yunshan Ma; Xiaoyu Du; Yibing Chen; Yanxin Zhang; Jinhui Tang

arXiv:2604.08952·cs.CL·April 17, 2026

TL;DR

MAB-DQA introduces a multi-armed bandit approach to improve document question answering by dynamically prioritizing query aspects, leading to significant performance gains on multiple benchmarks.

Contribution

The paper proposes an aspect-aware retrieval framework that uses multi-armed bandits to make better use of the multiple implicit aspects of a query in multimodal DQA.

Findings

01. Achieves 5%-18% improvement over state-of-the-art methods.

02. Effectively models query aspect importance for better retrieval.

03. Enhances document understanding in multimodal DQA.

Abstract

Document Question Answering (DQA) involves generating answers from a document based on a user's query, representing a key task in document understanding. This task requires interpreting visual layouts, which has prompted recent studies to adopt multimodal Retrieval-Augmented Generation (RAG) that processes page images for answer generation. However, in multimodal RAG, visual DQA struggles to utilize a large number of images effectively, as the retrieval stage often retains only a few candidate pages (e.g., Top-4), causing informative but less visually salient content to be overlooked in favor of common yet low-information pages. To address this issue, we propose a Multi-Armed Bandit-based DQA framework (MAB-DQA) to explicitly model the varying importance of multiple implicit aspects in a query. Specifically, MAB-DQA decomposes a query into aspect-aware subqueries and retrieves an…
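
The abstract is cut off before it explains how the bandit is applied, so the following is not the authors' algorithm. It is only a minimal, generic UCB1 sketch of the general idea of treating each aspect-aware subquery as an arm and spending a fixed page budget on the aspects that appear most useful. The function name `ucb1_select_pages`, the `aspect_rankings` input (a per-aspect ranked list of page IDs), and the `reward_fn` relevance scorer are all hypothetical names introduced for illustration.

```python
import math
from typing import Callable, Dict, List


def ucb1_select_pages(
    aspect_rankings: Dict[str, List[str]],
    reward_fn: Callable[[str, str], float],
    budget: int,
    c: float = 1.0,
) -> List[str]:
    """Select up to `budget` distinct pages, treating each aspect as a bandit arm.

    Hypothetical sketch: pulling an arm reveals that aspect's next-ranked page,
    and `reward_fn(aspect, page)` is an assumed relevance score in [0, 1].
    """
    pulls = {a: 0 for a in aspect_rankings}      # times each arm was pulled
    totals = {a: 0.0 for a in aspect_rankings}   # summed reward per arm
    cursor = {a: 0 for a in aspect_rankings}     # next unseen rank per arm
    selected: List[str] = []
    t = 0

    def ucb(arm: str) -> float:
        # Unpulled arms get infinite priority so every aspect is tried once.
        if pulls[arm] == 0:
            return math.inf
        mean = totals[arm] / pulls[arm]
        bonus = c * math.sqrt(2.0 * math.log(t) / pulls[arm])
        return mean + bonus

    while len(selected) < budget:
        # Only arms that still have unseen pages can be pulled.
        live = [a for a in aspect_rankings if cursor[a] < len(aspect_rankings[a])]
        if not live:
            break
        t += 1
        arm = max(live, key=ucb)
        page = aspect_rankings[arm][cursor[arm]]
        cursor[arm] += 1
        pulls[arm] += 1
        totals[arm] += reward_fn(arm, page)
        # The same page may be ranked by several aspects; keep it once.
        if page not in selected:
            selected.append(page)

    return selected
```

Under these assumptions, an aspect whose pages keep earning high reward receives more of the budget, while every aspect is still explored at least once, so a less prominent aspect is not dropped outright by a fixed Top-k cutoff.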

Code & Models

Repositories

ElephantOH/MAB-DQA (GitHub)
