Reinforcement Learning for on-line Sequence Transformation
Grzegorz Rypeść, Łukasz Lepak, Paweł Wawrzyński

TL;DR
This paper introduces a reinforcement learning-based architecture for on-line sequence transformation that can handle potentially infinite sequences. In neural machine translation it outperforms the autoencoder with attention and produces slightly worse translations than Transformer, while translating on-line.
Contribution
It presents a novel reinforcement learning architecture that learns when to read an input token and when to write an output token, enabling on-line sequence transformation, for which the authors state no efficient methods previously existed, and demonstrates its effectiveness in neural machine translation.
Findings
Outperforms the autoencoder with attention while translating on-line
Produces slightly worse results than Transformer in translation quality
Capable of transforming potentially infinite sequences on-line
Abstract
A number of problems in the processing of sound and natural language, as well as in other areas, can be reduced to simultaneously reading an input sequence and writing an output sequence of generally different length. There are well developed methods that produce the output sequence based on the entirely known input. However, efficient methods that enable such transformations on-line do not exist. In this paper we introduce an architecture that learns with reinforcement to make decisions about whether to read a token or write another token. This architecture is able to transform potentially infinite sequences on-line. In an experimental study we compare it with state-of-the-art methods for neural machine translation. While it produces slightly worse translations than Transformer, it outperforms the autoencoder with attention, even though our architecture translates texts on-line thereby…
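The read-or-write decision loop described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' architecture: the names `transform_online`, `wait_one_policy`, and `uppercase_emitter` are hypothetical, the policy is a hand-written rule rather than one learned with reinforcement, and the "translation" is a toy uppercase mapping. It only shows how alternating read and write actions lets the output be produced before the input is fully known.

```python
from itertools import cycle, islice
from typing import Callable, Iterable, Iterator, List

READ, WRITE = "read", "write"
END = "<eos>"  # hypothetical end-of-output marker


def transform_online(
    source: Iterable[str],
    policy: Callable[[List[str], List[str]], str],
    emit: Callable[[List[str], List[str]], str],
) -> Iterator[str]:
    """Interleave reading input tokens with writing output tokens."""
    stream = iter(source)
    read_so_far: List[str] = []
    written: List[str] = []
    exhausted = False
    while True:
        # Once the input has ended, only writing is possible.
        action = WRITE if exhausted else policy(read_so_far, written)
        if action == READ:
            try:
                read_so_far.append(next(stream))
            except StopIteration:
                exhausted = True
        else:
            token = emit(read_so_far, written)
            if token == END:
                return
            written.append(token)
            yield token


def wait_one_policy(read_so_far: List[str], written: List[str]) -> str:
    # Assumed stand-in for a learned policy: stay one token ahead of the output.
    return READ if len(read_so_far) <= len(written) else WRITE


def uppercase_emitter(read_so_far: List[str], written: List[str]) -> str:
    # Toy "translation": uppercase the next unwritten input token.
    if len(written) >= len(read_so_far):
        return END
    return read_so_far[len(written)].upper()


finite = list(transform_online(["a", "b", "c"], wait_one_policy, uppercase_emitter))
endless = list(islice(transform_online(cycle(["x", "y"]), wait_one_policy, uppercase_emitter), 5))
```

Because `transform_online` is a generator, it yields output while the input is still arriving, so it also runs on an endless stream such as `cycle(["x", "y"])`. This toy keeps the full token history in memory; a real model for unbounded input would need a bounded internal state instead.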
Taxonomy
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Label Smoothing · Layer Normalization · Byte Pair Encoding · Residual Connection · Dropout
