Counterfactual Learning for Machine Translation: Degeneracies and Solutions
Carolin Lawrence, Pratik Gajane, Stefan Riezler

TL;DR
This paper investigates counterfactual learning methods for machine translation, focusing on challenges posed by deterministic logging policies and proposing solutions to address estimator degeneracies.
Contribution
It analyzes degeneracies of inverse and reweighted propensity scoring estimators in both stochastic and deterministic settings, and relates them to recently proposed techniques for counterfactual learning under deterministic logging.
Findings
Degeneracies can cause estimator failures in deterministic logging scenarios.
Analysis links estimator issues to recent counterfactual learning methods.
Proposes solutions to mitigate degeneracies in offline machine translation learning.
Abstract
Counterfactual learning is a natural scenario to improve web-based machine translation services by offline learning from feedback logged during user interactions. In order to avoid the risk of showing inferior translations to users, in such scenarios mostly exploration-free deterministic logging policies are in place. We analyze possible degeneracies of inverse and reweighted propensity scoring estimators, in stochastic and deterministic settings, and relate them to recently proposed techniques for counterfactual learning under deterministic logging.
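To make the two estimator families named in the abstract concrete, the following is a minimal sketch of the textbook inverse propensity scoring (IPS) estimator and its reweighted (self-normalized) variant. It is not taken from the paper: the function names, the toy rewards, and the two candidate policies are all assumptions chosen for illustration. Deterministic logging is modeled by setting every logging probability to 1, and rewards are assumed to be non-negative.

```python
from typing import Sequence


def ips_estimate(rewards: Sequence[float],
                 target_probs: Sequence[float],
                 logging_probs: Sequence[float]) -> float:
    """Textbook IPS: average of reward * (target prob / logging prob)."""
    n = len(rewards)
    return sum(r * p / q
               for r, p, q in zip(rewards, target_probs, logging_probs)) / n


def reweighted_ips_estimate(rewards: Sequence[float],
                            target_probs: Sequence[float],
                            logging_probs: Sequence[float]) -> float:
    """Self-normalized IPS: importance weights are divided by their sum."""
    weights = [p / q for p, q in zip(target_probs, logging_probs)]
    return sum(r * w for r, w in zip(rewards, weights)) / sum(weights)


# Hypothetical log of three (input, output, reward) records.
rewards = [0.2, 0.5, 0.9]
# Deterministic logging: each logged output was shown with probability 1.
logging_probs = [1.0, 1.0, 1.0]

# Two hypothetical target policies, given by the probability each assigns
# to the logged output of each record.
selective = [0.1, 0.1, 0.8]   # favors only the high-reward output
uniform_up = [1.0, 1.0, 1.0]  # raises the probability of every logged output

# With non-negative rewards, plain IPS scores the indiscriminate policy higher.
ips_prefers_uniform = (ips_estimate(rewards, uniform_up, logging_probs)
                       > ips_estimate(rewards, selective, logging_probs))

# After reweighting, the selective policy scores higher instead.
reweighted_prefers_selective = (
    reweighted_ips_estimate(rewards, selective, logging_probs)
    > reweighted_ips_estimate(rewards, uniform_up, logging_probs))
```

In this toy setting the unnormalized estimate can be increased simply by raising the probability of every logged output regardless of its reward, which is one way such an estimator can degenerate. Normalizing by the sum of the weights removes that incentive here, though this example says nothing about other failure modes.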
Taxonomy
Topics: Machine Learning and Algorithms · Adversarial Robustness in Machine Learning · Machine Learning and Data Classification
