Distributional Reinforcement Learning for Energy-Based Sequential Models
Tetiana Parshakova, Jean-Marc Andreoli, Marc Dymetman

TL;DR
This paper relates sampling from unnormalized energy-based sequence models to Policy Gradient techniques in reinforcement learning, taken from a distributional rather than an optimization perspective, and illustrates the resulting approach in experiments with Global Autoregressive Models (GAMs).
Contribution
It proposes a distributional RL approach that applies to any sequential energy-based model, whereas the distillation technique of [Parshakova et al., CoNLL 2019] can only be applied under limited conditions.
Findings
Addresses the problem that an unnormalized sequential EBM cannot be directly exploited for sampling
Applicable to any sequential energy-based model
Effectiveness illustrated on GAM-based experiments
Abstract
Global Autoregressive Models (GAMs) are a recent proposal [Parshakova et al., CoNLL 2019] for exploiting global properties of sequences for data-efficient learning of seq2seq models. In the first phase of training, an Energy-Based Model (EBM) over sequences is derived. This EBM has high representational power, but is unnormalized and cannot be directly exploited for sampling. To address this issue, [Parshakova et al., CoNLL 2019] proposes a distillation technique, which can only be applied under limited conditions. By relating this problem to Policy Gradient techniques in RL, but from a distributional rather than an optimization perspective, we propose a general approach applicable to any sequential EBM. Its effectiveness is illustrated on GAM-based experiments.
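The abstract does not spell out the algorithm, so the following is only a minimal sketch of one way a "distributional" policy-gradient update could look, under stated assumptions: the policy is fitted to the normalized EBM by following an importance-weighted estimate of the cross-entropy gradient, instead of maximizing an expected reward. The toy energy function, the tabular autoregressive policy, and every name here (`unnorm_score`, `dpg_step`, `kl_to_target`) are hypothetical illustrations, not the authors' implementation.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
L = 3  # toy sequence length (assumption, chosen so the space can be enumerated)
seqs = list(itertools.product([0, 1], repeat=L))

def unnorm_score(x):
    # Hypothetical unnormalized EBM score P(x): rewards a global property
    # (even number of ones) plus a small local preference on the first token.
    return float(np.exp(2.0 * (sum(x) % 2 == 0) + 0.5 * x[0]))

# Normalized target p(x) = P(x) / Z; computable here only because the toy
# space has 8 sequences. It is used for evaluation, never for training.
P = np.array([unnorm_score(x) for x in seqs])
p = P / P.sum()

# Tabular autoregressive policy: one pair of logits per prefix.
theta = {pre: np.zeros(2) for n in range(L)
         for pre in itertools.product([0, 1], repeat=n)}

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def sample(theta):
    x = ()
    for _ in range(L):
        x += (int(rng.random() < softmax(theta[x])[1]),)
    return x

def prob(theta, x):
    return float(np.prod([softmax(theta[x[:t]])[x[t]] for t in range(L)]))

def dpg_step(theta, n=256, lr=0.5):
    # Sample from the current policy, then weight each sample by P(x)/pi(x).
    xs = [sample(theta) for _ in range(n)]
    w = np.array([unnorm_score(x) / prob(theta, x) for x in xs])
    w /= w.sum()  # self-normalization stands in for the unknown 1/Z
    grad = {k: np.zeros(2) for k in theta}
    for x, wi in zip(xs, w):
        for t in range(L):
            g = -softmax(theta[x[:t]])
            g[x[t]] += 1.0  # gradient of log pi(x_t | prefix) w.r.t. logits
            grad[x[:t]] += wi * g
    for k in theta:
        theta[k] += lr * grad[k]  # ascent on the estimate of E_p[log pi]

def kl_to_target(theta):
    pi = np.array([prob(theta, x) for x in seqs])
    return float(np.sum(p * np.log(p / pi)))

kl_before = kl_to_target(theta)
for _ in range(300):
    dpg_step(theta)
kl_after = kl_to_target(theta)
print(f"KL(p || pi): {kl_before:.3f} -> {kl_after:.4f}")
```

The contrast with standard policy gradient is in the weight: a reward-maximizing update would push probability mass toward the highest-scoring sequences, while the weight P(x)/pi(x) used in this sketch pulls the policy toward the whole distribution defined by the EBM, so the trained policy can serve as a sampler for it.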
Taxonomy
Topics: Evolutionary Algorithms and Applications · Reinforcement Learning in Robotics · Domain Adaptation and Few-Shot Learning
Methods: energy-based model · Sigmoid Activation · Tanh Activation · Long Short-Term Memory · Sequence to Sequence
