Estimating Subjective Crowd-Evaluations as an Additional Objective to Improve Natural Language Generation

Jakob Nyberg; Ramesh Manuvinakurike; Maike Paetzel-Prüsmann

arXiv:2104.05224 · cs.CL · April 13, 2021

TL;DR

This paper proposes using subjective crowd evaluations as an additional objective when training language generation models through multi-task learning. In a human evaluation, dialogue lines from the multi-task models were rated the most typical, the most moving the conversation forward, and the least offensive.

Contribution

It introduces a novel multi-task learning approach that incorporates subjective human ratings as an explicit training objective for language models.

Findings

01

Utterances generated by the multi-task models were rated as the most typical, the most moving the conversation forward, and the least offensive.

02

Subjective ratings can be effectively used as an additional training signal.

03

In this case study, using human evaluations as an explicit learning goal improved the subjective ratings of the generated dialogue lines.

Abstract

Human ratings are one of the most prevalent methods to evaluate the performance of natural language processing algorithms. Similarly, it is common to measure the quality of sentences generated by a natural language generation model using human raters. In this paper, we argue for exploring the use of subjective evaluations within the process of training language generation models in a multi-task learning setting. As a case study, we use a crowd-authored dialogue corpus to fine-tune six different language generation models. Two of these models incorporate multi-task learning and use subjective ratings of lines as part of an explicit learning goal. A human evaluation of the generated dialogue lines reveals that utterances generated by the multi-tasking models were subjectively rated as the most typical, most moving the conversation forward, and least offensive. Based on these promising…
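The idea of using subjective ratings "as part of an explicit learning goal" can be illustrated with a minimal sketch of a combined multi-task loss. This is not the authors' implementation: the abstract does not state the loss functions, the weighting, or the model architecture. The function names, the use of mean squared error for the rating term, and the `rating_weight` parameter are all assumptions made for illustration.

```python
import math

def cross_entropy(probs, target_index):
    """Negative log-likelihood of the target token under a predicted distribution."""
    return -math.log(probs[target_index])

def mean_squared_error(predicted, observed):
    """Mean squared error between predicted ratings and crowd-provided ratings."""
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed)

def multitask_loss(token_probs, target_tokens, predicted_ratings, crowd_ratings,
                   rating_weight=0.5):
    """Hypothetical combined objective: language-modeling loss plus a weighted
    rating-estimation loss. `rating_weight` is an assumed hyperparameter."""
    # Task 1: average next-token cross-entropy over the dialogue line.
    lm_loss = sum(
        cross_entropy(p, t) for p, t in zip(token_probs, target_tokens)
    ) / len(target_tokens)
    # Task 2: how well the model estimates the subjective crowd ratings
    # (for example typicality, moving the conversation forward, offensiveness).
    rating_loss = mean_squared_error(predicted_ratings, crowd_ratings)
    return lm_loss + rating_weight * rating_loss

# Toy example: a one-token line over a two-word vocabulary and a single rating.
loss = multitask_loss(
    token_probs=[[0.5, 0.5]],
    target_tokens=[0],
    predicted_ratings=[2.0],
    crowd_ratings=[4.0],
)
print(loss)
```

In a sketch like this, the rating term penalizes the model when its estimate of a line's crowd rating is off, so both tasks shape the shared parameters during fine-tuning. How the tasks are actually balanced in the paper is not stated in the abstract.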

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications