Understanding QA generation: Extracting Parametric and Contextual Knowledge with CQA for Low Resource Bangla Language
Umme Abira Azmary, MD Ikramul Kayes, Swakkhar Shatabda, Farig Yousuf Sadeque

TL;DR
This paper introduces BanglaCQA, a novel counterfactual QA dataset for Bangla, and analyzes how models utilize parametric versus contextual knowledge, revealing the effectiveness of Chain-of-Thought prompting in low-resource settings.
Contribution
It presents the first counterfactual QA dataset for Bangla and proposes methods to disentangle parametric and contextual knowledge in low-resource language models.
Findings
Chain-of-Thought prompting effectively extracts parametric knowledge in counterfactual scenarios.
Decoder-only LLMs perform better with Chain-of-Thought prompting in low-resource Bangla QA.
The BanglaCQA dataset enables detailed analysis of knowledge reliance in low-resource language models.
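To make "knowledge reliance" concrete, the sketch below labels a single model answer on a counterfactual example. It assumes the counterfactual passage implies a different answer than the real-world fact. The names `Reliance`, `normalize`, and `classify_reliance` are hypothetical, and exact string matching is a simplification chosen for illustration; the paper itself reports LLM-based and human evaluation, not this rule.

```python
from enum import Enum


class Reliance(Enum):
    CONTEXTUAL = "contextual"   # answer follows the (counterfactual) passage
    PARAMETRIC = "parametric"   # answer follows the real-world fact
    OTHER = "other"             # answer matches neither


def normalize(text: str) -> str:
    # Lowercase and collapse whitespace; a stand-in for real answer normalization.
    return " ".join(text.lower().split())


def classify_reliance(model_answer: str,
                      factual_answer: str,
                      counterfactual_answer: str) -> Reliance:
    # Assumption: factual_answer and counterfactual_answer differ.
    pred = normalize(model_answer)
    if pred == normalize(counterfactual_answer):
        return Reliance.CONTEXTUAL
    if pred == normalize(factual_answer):
        return Reliance.PARAMETRIC
    return Reliance.OTHER
```

For example, if the passage was edited to name a different city than the true one, an answer naming the true city would be labeled `PARAMETRIC` under this rule.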
Abstract
Question-Answering (QA) models for low-resource languages like Bangla face challenges due to limited annotated data and linguistic complexity. A key issue is determining whether models rely more on pre-encoded (parametric) knowledge or contextual input during answer generation, as existing Bangla QA datasets lack the structure required for such analysis. We introduce BanglaCQA, the first Counterfactual QA dataset in Bangla, by extending a Bangla dataset while integrating counterfactual passages and answerability annotations. In addition, we propose fine-tuned pipelines for encoder-decoder language-specific and multilingual baseline models, and prompting-based pipelines for decoder-only LLMs to disentangle parametric and contextual knowledge in both factual and counterfactual scenarios. Furthermore, we apply LLM-based and human evaluation techniques that measure answer quality based on…
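As a rough illustration of the setup the abstract describes, the sketch below models one counterfactual QA record and builds a Chain-of-Thought style prompt that asks for a context-based answer and a knowledge-based answer separately. The field names (`factual_passage`, `counterfactual_passage`, `answerable`) and the prompt wording are assumptions made for this example; they are not the BanglaCQA schema or the prompts used in the paper.

```python
from dataclasses import dataclass


@dataclass
class CQAExample:
    # Hypothetical record layout; the real dataset schema may differ.
    question: str
    factual_passage: str
    counterfactual_passage: str
    answerable: bool  # whether the passage contains an answer


def build_cot_prompt(example: CQAExample, use_counterfactual: bool) -> str:
    # Pick which passage the model sees as context.
    passage = (example.counterfactual_passage
               if use_counterfactual else example.factual_passage)
    # Illustrative wording only: ask for reasoning, then two separate answers.
    return (
        f"Passage:\n{passage}\n\n"
        f"Question: {example.question}\n\n"
        "Think step by step.\n"
        "Contextual answer (from the passage only, or 'unanswerable'):\n"
        "Parametric answer (from your own knowledge):"
    )
```

Calling `build_cot_prompt(example, use_counterfactual=True)` yields the counterfactual condition; comparing the two answer fields across the factual and counterfactual conditions is one way to separate the two knowledge sources.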
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · ICT in Developing Communities
