Training IBM Watson using Automatically Generated Question-Answer Pairs

Jangho Lee; Gyuwan Kim; Jaeyoon Yoo; Changwoo Jung; Minseok Kim; Sungroh Yoon

arXiv:1611.03932·cs.CL·November 15, 2016·1 cites

TL;DR

This paper explores the use of a large-scale automatically generated question-answer dataset to efficiently train IBM Watson, demonstrating its effectiveness and complementarity to manual data.

Contribution

First to investigate large-scale auto-generated question-answer pairs for training IBM Watson, improving training efficiency and accuracy.

Findings

01. Auto-generated dataset effectively trains Watson
02. Complementary to manual question-answer pairs
03. Enhances training efficiency and accuracy

Abstract

IBM Watson is a cognitive computing system capable of question answering in natural languages. It is believed that IBM Watson can understand large corpora and answer relevant questions more effectively than any other question-answering system currently available. To unleash the full power of Watson, however, we need to train its instance with a large number of well-prepared question-answer pairs. Obviously, manually generating such pairs in a large quantity is prohibitively time consuming and significantly limits the efficiency of Watson's training. Recently, a large-scale dataset of over 30 million question-answer pairs was reported. Under the assumption that using such an automatically generated dataset could relieve the burden of manual question-answer generation, we tried to use this dataset to train an instance of Watson and checked the training efficiency and accuracy. According…

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications