Acquiring Correct Knowledge for Natural Language Generation

E. Reiter; R. Robertson; S. G. Sripada

arXiv:1106.5264 · cs.CL · June 28, 2011

TL;DR

This paper discusses the challenges in acquiring correct knowledge for natural language generation systems, highlighting the limitations of existing techniques and suggesting combined approaches to improve knowledge accuracy.

Contribution

The paper identifies key problems in knowledge acquisition (KA) for NLG systems and proposes mixing KA techniques to mitigate the issues caused by task complexity and by variability in how people write.

Findings

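01. Corpus-based KA suffers because a sizable corpus of high-quality, consistent, manually written texts could not be assembled in the authors' domains.
02. Structured expert-oriented KA suffers from disagreement and limited data.
03. Combining multiple KA techniques can improve knowledge correctness.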

Abstract

Natural language generation (NLG) systems are computer software systems that produce texts in English and other human languages, often from non-linguistic input data. NLG systems, like most AI systems, need substantial amounts of knowledge. However, our experience in two NLG projects suggests that it is difficult to acquire correct knowledge for NLG systems; indeed, every knowledge acquisition (KA) technique we tried had significant problems. In general terms, these problems were due to the complexity, novelty, and poorly understood nature of the tasks our systems attempted, and were worsened by the fact that people write so differently. This meant in particular that corpus-based KA approaches suffered because it was impossible to assemble a sizable corpus of high-quality consistent manually written texts in our domains; and structured expert-oriented KA techniques suffered because…
