ViMMRC 2.0 -- Enhancing Machine Reading Comprehension on Vietnamese Literature Text
Son T. Luu, Khoi Trong Hoang, Tuong Quang Pham, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen

TL;DR
This paper introduces ViMMRC 2.0, a challenging Vietnamese multiple-choice reading comprehension dataset, together with a multi-stage model intended to handle implicit context better, with the aim of advancing machine comprehension of Vietnamese texts.
Contribution
The paper presents a new, more difficult Vietnamese reading comprehension dataset and a multi-stage attention-based model that outperforms previous baselines.
Findings
The dataset contains 699 reading passages (prose and poems) and 5,273 questions, which are harder than those in the original ViMMRC.
The proposed model improves accuracy over baseline BERT models.
Understanding implicit context remains a key challenge for models.
Abstract
Machine reading comprehension has been an interesting and challenging task in recent years, with the purpose of extracting useful information from texts. To give computers the ability to understand a reading text and answer questions about it, we introduce ViMMRC 2.0, an extension of the previous ViMMRC for multiple-choice reading comprehension on Vietnamese textbooks, which contain reading articles for students from Grade 1 to Grade 12. The dataset has 699 reading passages, comprising prose and poems, and 5,273 questions. Unlike in the previous version, questions in the new dataset are not fixed at four options. Moreover, the questions are more difficult, which challenges models to find the correct choice. The computer must understand the whole context of the reading passage, the question, and the content of each choice to extract the right answer.…
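Because the abstract states that questions are no longer fixed at four options, an evaluation script cannot assume a constant option count. The following is a minimal sketch of how such an item might be represented and scored. The class name `MCQuestion`, its field names, and the helper functions are hypothetical illustrations, not the dataset's actual schema or the authors' evaluation code, and the item texts are placeholders.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class MCQuestion:
    """One multiple-choice item (hypothetical layout, not the real schema)."""
    passage: str
    question: str
    options: List[str]   # number of options may differ between questions
    answer_index: int    # 0-based index of the correct option

    def __post_init__(self) -> None:
        if len(self.options) < 2:
            raise ValueError("a multiple-choice question needs at least two options")
        if not 0 <= self.answer_index < len(self.options):
            raise ValueError("answer_index must point at one of the options")


def accuracy(items: Sequence[MCQuestion], predictions: Sequence[int]) -> float:
    """Fraction of items whose predicted option index equals the gold index."""
    if len(items) != len(predictions):
        raise ValueError("items and predictions must have the same length")
    if not items:
        return 0.0
    correct = sum(1 for item, pred in zip(items, predictions)
                  if pred == item.answer_index)
    return correct / len(items)


def chance_accuracy(items: Sequence[MCQuestion]) -> float:
    """Expected accuracy of uniform random guessing.

    With a variable number of options this is the mean of 1/len(options)
    over the items, not a fixed 1/4.
    """
    if not items:
        return 0.0
    return sum(1.0 / len(item.options) for item in items) / len(items)


if __name__ == "__main__":
    # Placeholder items; the texts are not taken from the dataset.
    items = [
        MCQuestion("passage text", "question text", ["A", "B", "C", "D"], 2),
        MCQuestion("passage text", "question text", ["A", "B"], 0),
    ]
    print(accuracy(items, [2, 1]))   # one of two predictions is correct
    print(chance_accuracy(items))    # mean of 1/4 and 1/2
```

Computing the chance level per item matters here: with a mix of option counts, a fixed 25% floor would misstate how far a model is above random guessing.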
Taxonomy
Topics: Text Readability and Simplification · Online Learning and Analytics · Topic Modeling
Methods: Test
