Calibrating Sequence likelihood Improves Conditional Language Generation
Yao Zhao, Misha Khalman, Rishabh Joshi, Shashi Narayan, Mohammad Saleh, Peter J. Liu

TL;DR
This paper introduces sequence likelihood calibration (SLiC), a method that improves the ranking of generated sequences in conditional language models, reducing reliance on heuristics and enhancing output quality across various tasks.
Contribution
The paper proposes SLiC, a calibration technique that adjusts the likelihoods of model-generated sequences so that they better align with reference sequences in the model's latent space, improving decoding quality without heuristics.
Findings
SLiC improves sequence ranking and quality across tasks.
Decoding heuristics become unnecessary with SLiC.
SLiC maintains effectiveness across different model scales.
Abstract
Conditional language models are predominantly trained with maximum likelihood estimation (MLE), giving probability mass to sparsely observed target sequences. While MLE-trained models assign high probability to plausible sequences given the context, the model probabilities often do not accurately rank-order generated sequences by quality. This has been empirically observed in beam search decoding as output quality degrading with large beam sizes, and decoding strategies benefiting from heuristics such as length normalization and repetition-blocking. In this work, we introduce sequence likelihood calibration (SLiC), where the likelihood of model-generated sequences is calibrated to better align with reference sequences in the model's latent space. With SLiC, decoding heuristics become unnecessary and decoding candidates' quality significantly improves regardless of the decoding method.…
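The abstract does not spell out the calibration objective, so the following is only a minimal sketch of one way "calibrating" sequence likelihoods could look, assuming a pairwise margin ranking loss. The function names, the `margin` parameter, and the similarity-to-reference scores are hypothetical and are not taken from the paper.

```python
def rank_calibration_loss(logp_better, logp_worse, margin=1.0):
    """Hinge-style pairwise loss (an assumed form, not the paper's stated one).

    Returns zero once the better candidate's sequence log-likelihood
    exceeds the worse candidate's by at least `margin`.
    """
    return max(0.0, margin - logp_better + logp_worse)


def calibration_loss(candidates, margin=1.0):
    """Average the pairwise loss over all candidate pairs.

    `candidates` is a list of (sequence_log_likelihood, similarity_to_reference)
    tuples. The similarity score is a hypothetical stand-in for whatever
    measure decides which candidate is closer to the reference.
    """
    losses = []
    for i, (logp_i, sim_i) in enumerate(candidates):
        for logp_j, sim_j in candidates[i + 1:]:
            if sim_i == sim_j:
                continue  # no preference between equally similar candidates
            if sim_i > sim_j:
                losses.append(rank_calibration_loss(logp_i, logp_j, margin))
            else:
                losses.append(rank_calibration_loss(logp_j, logp_i, margin))
    return sum(losses) / len(losses) if losses else 0.0


# The model prefers the candidate that is less similar to the reference,
# so the loss is positive and would push the likelihoods toward the right order.
misranked = [(-5.0, 0.2), (-6.0, 0.9)]
print(calibration_loss(misranked))
```

Under this assumed objective, candidates whose likelihoods already agree with their similarity ranking by at least the margin contribute no loss, so only misranked pairs drive the update.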
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
Methods: ALIGN
