Improving Text Generation Evaluation with Batch Centering and Tempered Word Mover Distance
Xi Chen, Nan Ding, Tomer Levinboim, Radu Soricut

TL;DR
This paper introduces two techniques, batch-mean centering and tempered Word Mover Distance, to enhance the accuracy of text generation evaluation metrics using BERT-based representations, achieving better correlation with human judgments.
Contribution
It proposes a batch-mean centering strategy and a computationally efficient tempered Word Mover Distance for BERT-based similarity metrics, leading to state-of-the-art correlation with human ratings.
Findings
Improved correlation with human judgments across benchmarks.
Enhanced statistical properties of contextualized word representations.
State-of-the-art performance in text evaluation metrics.
Abstract
Recent advances in automatic evaluation metrics for text have shown that deep contextualized word representations, such as those generated by BERT encoders, are helpful for designing metrics that correlate well with human judgements. At the same time, it has been argued that contextualized word representations exhibit sub-optimal statistical properties for encoding the true similarity between words or sentences. In this paper, we present two techniques for improving encoding representations for similarity metrics: a batch-mean centering strategy that improves statistical properties; and a computationally efficient tempered Word Mover Distance, for better fusion of the information in the contextualized word representations. We conduct numerical experiments that demonstrate the robustness of our techniques, reporting results over various BERT-backbone learned metrics and achieving state…
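The abstract names batch-mean centering without giving its exact form. A minimal sketch of one plausible reading follows: subtract the mean token embedding, computed over all real (non-padding) tokens in a batch, from every token embedding. The function name `batch_center`, the array shapes, and the padding-mask convention are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def batch_center(embeddings, mask):
    """Subtract the batch-wide mean token embedding (illustrative sketch).

    embeddings: float array of shape (batch, seq_len, dim), e.g. encoder outputs.
    mask:       array of shape (batch, seq_len), 1 for real tokens, 0 for padding.
    Returns the centered embeddings (padding positions zeroed) and the mean vector.
    """
    m = mask.astype(embeddings.dtype)[..., None]   # (batch, seq_len, 1)
    total = (embeddings * m).sum(axis=(0, 1))      # sum over real tokens only
    count = m.sum()                                # number of real tokens
    mean = total / count                           # (dim,)
    centered = (embeddings - mean) * m             # keep padding at zero
    return centered, mean


def cosine(u, v):
    """Cosine similarity between two 1-D vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))
```

As a usage note, when every embedding shares a large common offset, raw cosine similarities between tokens cluster near 1 and carry little information; comparing `cosine` on rows of `embeddings` with the same rows of `centered` shows how removing the shared mean spreads the similarities out. Whether this matches the statistics the authors actually compute is an assumption.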
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
Methods: Linear Layer · Adam · Softmax · Layer Normalization · Dense Connections · Multi-Head Attention · Dropout · Linear Warmup With Linear Decay · Attention Dropout
