Impact of Decoding Methods on Human Alignment of Conversational LLMs
Shaz Furniturewala, Kokil Jaidka, Yashvardhan Sharma

TL;DR
This paper investigates how different decoding methods affect the alignment of conversational large language models with human speech, introducing new measures and analyzing various datasets to understand the nuances of alignment.
Contribution
It introduces new measures of alignment and systematically analyzes the impact of decoding methods and dataset types on human-LLM conversation alignment.
Findings
Fewer beams in Beam Search improve alignment.
Lower P values in Nucleus Sampling enhance alignment.
Task-oriented and open-ended datasets show different alignment patterns.
Abstract
To be incorporated into chatbot systems, large language models (LLMs) must be aligned with human conversational conventions. However, because they are trained mainly on web-scraped data, existing LLMs have a voice closer to informational text than to actual human speech. In this paper, we examine how decoding methods, including Beam Search, Top K Sampling, and Nucleus Sampling, affect the alignment between LLM-generated and human conversations. We present new measures of alignment in substance, style, and psychometric orientation, and experiment with two conversation datasets. Our results provide nuanced insights: better alignment is associated with fewer beams in Beam Search and lower values of P in Nucleus Sampling. We also find that task-oriented and open-ended datasets perform differently in terms of alignment, indicating the importance of taking the context of the interaction into account.
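For readers unfamiliar with the sampling-based decoding methods named in the abstract, the following minimal sketch shows how Top K Sampling and Nucleus Sampling each restrict a next-token distribution before sampling. It is an illustration only, not the authors' implementation: the function names (`top_k_filter`, `nucleus_filter`, `sample`) and the toy distribution are hypothetical, and a real system would apply the same idea to a model's logits over its full vocabulary.

```python
import random

def top_k_filter(probs, k):
    """Keep the k most probable tokens and renormalize."""
    kept = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k]
    total = sum(p for _, p in kept)
    return {tok: p / total for tok, p in kept}

def nucleus_filter(probs, p):
    """Keep the smallest top-ranked set whose cumulative mass reaches p."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, cumulative = [], 0.0
    for tok, prob in ranked:
        kept.append((tok, prob))
        cumulative += prob
        if cumulative >= p:
            break
    total = sum(prob for _, prob in kept)
    return {tok: prob / total for tok, prob in kept}

def sample(probs, rng):
    """Draw one token from a (filtered) distribution."""
    tokens = list(probs)
    weights = [probs[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]

# Hypothetical next-token distribution, made up for illustration.
toy = {"sure": 0.5, "certainly": 0.3, "indeed": 0.15, "verily": 0.05}

rng = random.Random(0)
print(sorted(top_k_filter(toy, 2)))      # candidate tokens under Top K, k=2
print(sorted(nucleus_filter(toy, 0.7)))  # candidate tokens under Nucleus, P=0.7
print(sample(nucleus_filter(toy, 0.9), rng))
```

A lower P shrinks the candidate set toward the most probable tokens, which is the knob the abstract refers to when it reports that lower values of P are associated with better alignment.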
Taxonomy
Topics: Speech and dialogue systems · Semantic Web and Ontologies
