Distributional Reinforcement Learning for Energy-Based Sequential Models

Tetiana Parshakova; Jean-Marc Andreoli; Marc Dymetman

arXiv:1912.08517·cs.LG·December 19, 2019·6 cites

TL;DR

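This paper relates sampling from unnormalized energy-based sequence models to policy gradient techniques in RL, taken from a distributional rather than an optimization perspective, and proposes a general approach applicable to any sequential EBM. Its effectiveness is illustrated in experiments with Global Autoregressive Models (GAMs).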

Contribution

The paper proposes a distributional RL method that applies to any sequential energy-based model, whereas the earlier distillation technique of [Parshakova et al., CoNLL 2019] can be applied only under limited conditions.

Findings

01 Effective at producing a model that can be sampled from an unnormalized sequential EBM

02 Applicable to any sequential energy-based model

03 Demonstrated on GAM-based experiments

Abstract

Global Autoregressive Models (GAMs) are a recent proposal [Parshakova et al., CoNLL 2019] for exploiting global properties of sequences for data-efficient learning of seq2seq models. In the first phase of training, an Energy-Based Model (EBM) over sequences is derived. This EBM has high representational power, but is unnormalized and cannot be directly exploited for sampling. To address this issue, [Parshakova et al., CoNLL 2019] proposes a distillation technique, which can only be applied under limited conditions. By relating this problem to Policy Gradient techniques in RL, but from a distributional rather than an optimization perspective, we propose a general approach applicable to any sequential EBM. Its effectiveness is illustrated on GAM-based experiments.
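
One way to read the approach described in the abstract is as an importance-weighted policy gradient: sample sequences from a trainable policy, weight each sample by the ratio of the EBM's unnormalized score to the policy probability, and follow the gradient of the policy's log-probability. The sketch below illustrates that idea on a toy problem and is not the authors' implementation. The toy energy (score exp(number of ones)), the tabular softmax policy over all eight binary sequences of length 3, the function name train_dpg, and the hyperparameters are all assumptions made for illustration.

```python
import itertools
import math
import random


def softmax(theta):
    """Numerically stable softmax over a list of logits."""
    m = max(theta)
    exps = [math.exp(t - m) for t in theta]
    total = sum(exps)
    return [e / total for e in exps]


def train_dpg(steps=2000, batch=64, lr=0.005, seed=0):
    """Fit a tabular policy to a toy unnormalized EBM (illustrative only)."""
    rng = random.Random(seed)
    # All binary sequences of length 3.
    seqs = list(itertools.product([0, 1], repeat=3))
    # Hypothetical toy EBM: unnormalized score P(x) = exp(number of ones).
    P = [math.exp(sum(x)) for x in seqs]
    # One logit per sequence; the policy starts uniform.
    theta = [0.0] * len(seqs)

    for _ in range(steps):
        pi = softmax(theta)
        # Sample from the current policy, not from the EBM.
        samples = rng.choices(range(len(seqs)), weights=pi, k=batch)
        hit = [0.0] * len(seqs)
        total_w = 0.0
        for i in samples:
            w = P[i] / pi[i]  # importance weight, uses only unnormalized P
            hit[i] += w
            total_w += w
        # For a softmax, d log pi(i) / d theta_j = 1[i == j] - pi_j, so the
        # batch sum of w * grad log pi(i) is hit[j] - total_w * pi[j].
        for j in range(len(seqs)):
            theta[j] += lr * (hit[j] - total_w * pi[j]) / batch

    pi = softmax(theta)
    # The partition function is computed here only to evaluate the result.
    Z = sum(P)
    p = [v / Z for v in P]
    tv = 0.5 * sum(abs(a - b) for a, b in zip(pi, p))
    return seqs, pi, p, tv


if __name__ == "__main__":
    seqs, pi, p, tv = train_dpg()
    for x, a, b in zip(seqs, pi, p):
        print(x, round(a, 3), round(b, 3))
    print("total variation distance:", round(tv, 3))
```

In expectation the update in this sketch is proportional to p - pi, where p is the normalized EBM, so the policy drifts toward the target distribution without the partition function being needed during training. A real sequential model would replace the table with an autoregressive network over tokens.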

Code & Models

Repositories

parshakova/GAMS-for-Data-Efficient-Learning
PyTorch · Official

Taxonomy

Topics: Evolutionary Algorithms and Applications · Reinforcement Learning in Robotics · Domain Adaptation and Few-Shot Learning

Methods: Energy-Based Model · Sigmoid Activation · Tanh Activation · Long Short-Term Memory · Sequence to Sequence