Table Question Answering for Low-resourced Indic Languages
Vaishali Pal, Evangelos Kanoulas, Andrew Yates, Maarten de Rijke

TL;DR
This paper introduces a scalable, fully automatic data generation method for table question answering (tableQA) in low-resource Indic languages. Models trained on the generated Bengali and Hindi data outperform state-of-the-art large language models, despite the scarcity of annotated data.
Contribution
The paper presents the first scalable data generation and evaluation approach for low-resource tableQA, applied to Bengali and Hindi, two languages that previously had no tableQA datasets or models.
Findings
Models trained on generated data outperform state-of-the-art LLMs.
The data generation method is applicable to any low-resource language with a web presence.
The study analyzes the trained models' mathematical reasoning and zero-shot cross-lingual transfer capabilities.
Abstract
TableQA is the task of answering questions over tables of structured information, returning individual cells or tables as output. TableQA research has focused primarily on high-resource languages, leaving medium- and low-resource languages with little progress due to the scarcity of annotated data and neural models. We address this gap by introducing a fully automatic large-scale tableQA data generation process for low-resource languages with limited budget. We apply our data generation method to two Indic languages, Bengali and Hindi, which have no tableQA datasets or models. TableQA models trained on our large-scale datasets outperform state-of-the-art LLMs. We further study the trained models on different aspects, including mathematical reasoning capabilities and zero-shot cross-lingual transfer. Our work is the first on low-resource tableQA focusing on scalable data generation and…
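To make the task definition in the abstract concrete, the following is a minimal sketch of the input and output shape of tableQA: a table, a question, and the cells returned as the answer. The table contents, the question, and the helper `answer_cells` are hypothetical illustrations and are not taken from the paper's datasets or models. A real tableQA model would take the natural-language question directly; here the question is hand-translated into a column filter purely to show what "returning individual cells" means.

```python
# Toy illustration only: the table, question, and helper are hypothetical
# and do not come from the paper's datasets or models.
from typing import Dict, List

Table = List[Dict[str, str]]


def answer_cells(table: Table, where_col: str, where_val: str, select_col: str) -> List[str]:
    """Return the cells of select_col for rows whose where_col equals where_val."""
    return [row[select_col] for row in table if row[where_col] == where_val]


table: Table = [
    {"city": "Kolkata", "state": "West Bengal"},
    {"city": "Patna", "state": "Bihar"},
    {"city": "Siliguri", "state": "West Bengal"},
]

# Question (hand-translated into a filter): "Which cities are in West Bengal?"
cells = answer_cells(table, where_col="state", where_val="West Bengal", select_col="city")
print(cells)
```

In the low-resource setting described above, both the table and the question would be in Bengali or Hindi, and the mapping from question to answer cells would be learned by the model rather than written by hand.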
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications
