Estimating Subjective Crowd-Evaluations as an Additional Objective to Improve Natural Language Generation
Jakob Nyberg, Ramesh Manuvinakurike, Maike Paetzel-Prüsmann

TL;DR
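This paper proposes using subjective crowd evaluations as an additional training objective for language generation models through multi-task learning. In a human evaluation, dialogue lines generated by the multi-task models were rated the most typical, the most likely to move the conversation forward, and the least offensive.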
Contribution
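The paper introduces a multi-task learning approach that uses subjective human ratings of dialogue lines as an explicit training objective. As a case study, six language generation models are fine-tuned on a crowd-authored dialogue corpus, two of them with the multi-task objective.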
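The idea of treating ratings as a second learning goal can be illustrated with a small sketch. This is not the authors' implementation: the summary does not state how the two objectives are combined, so the weighted sum, the mean-squared-error rating loss, and every name here (`multi_task_loss`, `rating_weight`, and so on) are assumptions made for illustration only.

```python
import math


def cross_entropy(logits, target_index):
    """Negative log-likelihood of the target token under softmax(logits)."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[target_index]


def language_model_loss(token_logits, target_tokens):
    """Mean cross-entropy over all token positions in one utterance."""
    losses = [cross_entropy(l, t) for l, t in zip(token_logits, target_tokens)]
    return sum(losses) / len(losses)


def rating_loss(predicted_ratings, crowd_ratings):
    """Mean squared error between predicted and crowd-sourced ratings.

    Assumption: ratings are treated as a regression target.
    """
    errors = [(p - c) ** 2 for p, c in zip(predicted_ratings, crowd_ratings)]
    return sum(errors) / len(errors)


def multi_task_loss(token_logits, target_tokens,
                    predicted_ratings, crowd_ratings, rating_weight=0.5):
    """Hypothetical combined objective: LM loss plus a weighted rating loss."""
    lm = language_model_loss(token_logits, target_tokens)
    rt = rating_loss(predicted_ratings, crowd_ratings)
    return lm + rating_weight * rt


# Toy usage: two token positions over a two-word vocabulary, and two
# rating dimensions (for example typicality and offensiveness).
loss = multi_task_loss(
    token_logits=[[0.0, 0.0], [0.0, 0.0]],
    target_tokens=[0, 1],
    predicted_ratings=[1.0, 3.0],
    crowd_ratings=[2.0, 3.0],
)
print(loss)
```

In a real fine-tuning setup the logits and predicted ratings would come from a shared model with two output heads, so that gradients from the rating loss also shape the representation used for generation. The weight on the rating term would need to be tuned.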
Findings
Utterances from the multi-task models were rated as the most typical, the most likely to move the conversation forward, and the least offensive.
Subjective ratings can be effectively used as an additional training signal.
In this case study, incorporating human evaluations into training improved the subjectively rated quality of the generated dialogue lines.
Abstract
Human ratings are one of the most prevalent methods to evaluate the performance of natural language processing algorithms. Similarly, it is common to measure the quality of sentences generated by a natural language generation model using human raters. In this paper, we argue for exploring the use of subjective evaluations within the process of training language generation models in a multi-task learning setting. As a case study, we use a crowd-authored dialogue corpus to fine-tune six different language generation models. Two of these models incorporate multi-task learning and use subjective ratings of lines as part of an explicit learning goal. A human evaluation of the generated dialogue lines reveals that utterances generated by the multi-tasking models were subjectively rated as the most typical, most moving the conversation forward, and least offensive. Based on these promising…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
