Differentiable lower bound for expected BLEU score

Vlad Zhukov; Eugene Golikov; Maksim Kretov

arXiv:1712.04708 · cs.CL · August 24, 2018 · 6 citations

TL;DR

This paper proposes a differentiable lower bound on the expected BLEU score. The bound allows gradient-based optimization without the computationally expensive sampling required by the REINFORCE rule, and it targets the loss-evaluation mismatch in NLP tasks.

Contribution

A method for computing a differentiable lower bound on the expected BLEU score without a sampling procedure, so that the bound can be optimized directly with gradient-based methods.

Findings

01 The method avoids the computationally expensive sampling procedure that the REINFORCE rule requires.

02 It targets the loss-evaluation mismatch, that is, the gap between optimizing a surrogate loss and improving the BLEU score.

03 Because the bound is differentiable, it can be optimized directly with gradient-based methods.

Abstract

In natural language processing tasks, model performance is often measured with a non-differentiable metric such as the BLEU score. To use efficient gradient-based optimization methods, a common workaround is to optimize a surrogate loss function instead. This approach is effective if optimizing the surrogate loss also improves the target metric; the discrepancy between the two is referred to as loss-evaluation mismatch. In the present work we propose a method for calculating a differentiable lower bound of the expected BLEU score that does not involve a computationally expensive sampling procedure, such as the one required when using the REINFORCE rule from the reinforcement learning (RL) framework.
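
The contrast between a sampling-based estimate and a closed-form expectation can be illustrated on a toy quantity. The sketch below is not the paper's bound: it assumes, purely for illustration, that candidate tokens are drawn independently at each position, and it uses the unclipped count of candidate tokens that occur in the reference, which is far simpler than BLEU. The function names `expected_unigram_matches` and `sampled_unigram_matches` are hypothetical.

```python
import random


def expected_unigram_matches(probs, reference):
    """Exact expected number of candidate positions whose token occurs in the reference.

    probs[t][v] is the probability of vocabulary item v at position t.
    Positions are assumed independent (an assumption of this sketch).
    The result is linear in probs, hence differentiable with respect to them.
    """
    ref = set(reference)
    return sum(p_t[v] for p_t in probs for v in ref)


def sampled_unigram_matches(probs, reference, n_samples, seed=0):
    """Monte Carlo estimate of the same quantity, obtained by sampling candidates."""
    rng = random.Random(seed)
    ref = set(reference)
    vocab = range(len(probs[0]))
    total = 0
    for _ in range(n_samples):
        # Draw one token per position from that position's distribution.
        tokens = [rng.choices(vocab, weights=p_t)[0] for p_t in probs]
        total += sum(tok in ref for tok in tokens)
    return total / n_samples


# Toy setup: 3 positions, vocabulary of 4 token ids, uniform distributions.
probs = [[0.25, 0.25, 0.25, 0.25]] * 3
reference = [0, 1, 1]

exact = expected_unigram_matches(probs, reference)
estimate = sampled_unigram_matches(probs, reference, n_samples=2000)
print(exact, estimate)
```

The closed form needs one pass over the probabilities, whereas the estimate needs many sampled candidates and is still noisy. BLEU itself involves clipped n-gram counts and a brevity penalty, so its expectation has no such simple form, which is where a lower bound becomes useful.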

Taxonomy

Topics: Topic Modeling · Reinforcement Learning in Robotics · Machine Learning in Healthcare

Methods: REINFORCE