Reinforcement Learning for on-line Sequence Transformation

Grzegorz Rypeść; Łukasz Lepak; Paweł Wawrzyński

arXiv:2105.14097·cs.LG·February 17, 2022

TL;DR

This paper introduces a reinforcement learning-based architecture for on-line sequence transformation. It can transform potentially infinite sequences and, although it translates on-line, it outperforms the autoencoder with attention in neural machine translation.

Contribution

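It presents a novel architecture that learns with reinforcement whether to read an input token or write an output token at each step. This enables on-line sequence transformation, for which, according to the authors, no efficient methods existed, and the paper evaluates the architecture on neural machine translation.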

Findings

01 Outperforms the autoencoder with attention, even though it translates on-line

02 Produces slightly worse translations than Transformer

03 Capable of transforming potentially infinite sequences on-line

Abstract

A number of problems in the processing of sound and natural language, as well as in other areas, can be reduced to simultaneously reading an input sequence and writing an output sequence of generally different length. There are well developed methods that produce the output sequence based on the entirely known input. However, efficient methods that enable such transformations on-line do not exist. In this paper we introduce an architecture that learns with reinforcement to make decisions about whether to read a token or write another token. This architecture is able to transform potentially infinite sequences on-line. In an experimental study we compare it with state-of-the-art methods for neural machine translation. While it produces slightly worse translations than Transformer, it outperforms the autoencoder with attention, even though our architecture translates texts on-line thereby…
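
The read-or-write decision loop described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' architecture: the learned reinforcement learning policy is replaced here by a fixed, hypothetical rule (`lag_policy`), the output has one token per input token (the paper allows different lengths), and the names `transform_online`, `lag_policy`, and `emit` are assumptions made for illustration only.

```python
from itertools import count, islice

READ, WRITE = "read", "write"


def lag_policy(k):
    """Hypothetical stand-in for a learned policy: stay k tokens ahead."""
    def policy(n_read, n_written):
        return READ if n_read - n_written < k else WRITE
    return policy


def transform_online(source, policy, emit):
    """Interleave reads and writes, yielding each output token immediately.

    Toy assumption: exactly one output token per input token.
    `seen` keeps the whole history for simplicity; a real on-line system
    would keep a bounded state instead.
    """
    stream = iter(source)
    seen = []
    written = 0
    exhausted = False
    while True:
        can_write = written < len(seen)
        if exhausted:
            if not can_write:
                return  # input finished and everything has been written
            action = WRITE
        else:
            action = policy(len(seen), written)
            if action == WRITE and not can_write:
                action = READ  # nothing pending, so reading is the only move
        if action == READ:
            try:
                seen.append(next(stream))
            except StopIteration:
                exhausted = True
        else:
            yield emit(seen, written)
            written += 1


def upper_emit(seen, i):
    """Toy 'translation': upper-case the i-th token read so far."""
    return seen[i].upper()


finite = list(transform_online(["a", "b", "c"], lag_policy(2), upper_emit))

# The source below never ends; only the first five outputs are consumed.
endless = (str(i) for i in count())
first_five = list(islice(transform_online(endless, lag_policy(2), upper_emit), 5))
```

Because `transform_online` is a generator, output tokens become available before the input ends, which is what allows it to run over a source that never terminates. In the paper's setting the fixed rule would be replaced by a policy trained with reinforcement.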

Taxonomy

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Label Smoothing · Layer Normalization · Byte Pair Encoding · Residual Connection · Dropout