Conversational Machine Reading Comprehension for Vietnamese Healthcare Texts

Son T. Luu; Mao Nguyen Bui; Loi Duc Nguyen; Khiem Vinh Tran; Kiet Van Nguyen; Ngan Luu-Thuy Nguyen

arXiv:2105.01542·cs.CL·October 1, 2021

TL;DR

This paper introduces UIT-ViCoQA, a new Vietnamese conversational machine reading comprehension dataset focused on health news, and evaluates baseline models, highlighting significant room for improvement.

Contribution

The paper presents a novel Vietnamese conversational MRC dataset with 10,000 questions over 2,000 health conversations and provides baseline evaluations.

Findings

01. Best model F1 score of 45.27%

02. Human performance F1 score of 76.18%

03. The 30.91-point gap between the best model and human performance indicates a need for further research

Abstract

Machine reading comprehension (MRC) is a sub-field of natural language processing that aims to help computers understand unstructured texts and then answer questions about them. In practice, conversation is an essential way to communicate and transfer information. To help machines understand conversational texts, we present UIT-ViCoQA, a new corpus for conversational machine reading comprehension in the Vietnamese language. This corpus consists of 10,000 questions with answers over 2,000 conversations about health news articles. We then evaluate several baseline approaches for conversational machine comprehension on the UIT-ViCoQA corpus. The best model obtains an F1 score of 45.27%, which is 30.91 points behind human performance (76.18%), indicating that there is ample room for improvement. Our dataset is available at our website: http://nlp.uit.edu.vn/datasets/ for research…
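For readers unfamiliar with how F1 is typically computed for reading comprehension answers, here is a minimal sketch of a token-overlap F1 in the style commonly used for extractive and conversational QA benchmarks. It is an assumption that the paper's evaluation follows this general scheme; the exact normalization and tokenization used by the authors are not given on this page. The function name `token_f1` and the example strings are hypothetical, and plain whitespace splitting is used here, which for Vietnamese splits on syllables rather than words.

```python
from collections import Counter


def token_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1 between a predicted and a reference answer.

    Assumption: lowercase + whitespace tokenization; the paper's own
    evaluation script may normalize answers differently.
    """
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()

    # If either side is empty, score 1.0 only when both are empty.
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)

    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0

    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


# Hypothetical answer pair, for illustration only (not from the dataset).
example_score = token_f1("uống nhiều nước", "nên uống nhiều nước")

# Gap between human and best-model F1, using the figures in the abstract.
human_f1 = 76.18
best_model_f1 = 45.27
gap = round(human_f1 - best_model_f1, 2)  # 30.91 points
```

A corpus-level score under this scheme would be the mean of `token_f1` over all question-answer pairs; when several reference answers exist, implementations usually take the maximum over references for each question.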

Code & Models

Repositories

sonlam1102/vicoqa-cmc (Official)