Encoding Sentence Position in Context-Aware Neural Machine Translation with Concatenation

Lorenzo Lupo; Marco Dinarelli; Laurent Besacier

arXiv:2302.06459 · cs.CL · April 6, 2023

TL;DR

This paper compares methods for encoding sentence position in context-aware neural machine translation with concatenation. Certain encodings help on English-to-Russian translation when the model is trained with a context-discounted loss, but the same benefits are not observed on English-to-German.

Contribution

The paper compares existing and novel methods for encoding sentence positions into the token representations of a Transformer that translates a concatenation of consecutive sentences, and identifies the training condition (a context-discounted loss) under which some of these methods help.

Findings

01 Certain sentence position encoding methods improve English-to-Russian translation.

02 These benefits appear when the model is trained with a context-discounted loss (Lupo et al., 2022).

03 The same benefits are not observed in English-to-German translation.
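
The context-discounted loss named in finding 02 is not defined on this page. As a rough sketch only, assuming it means that the per-token loss on the context sentences of the target window is multiplied by a discount factor while the loss on the current sentence keeps full weight, the computation could look like the following. The function name, the boolean mask, and the choice to sum rather than average are illustrative assumptions, not the authors' implementation.

```python
from typing import List


def context_discounted_loss(
    token_losses: List[float],
    is_context: List[bool],
    discount: float,
) -> float:
    """Sum per-token losses, down-weighting tokens that belong to context.

    Assumption: `discount` lies in [0, 1]; 1.0 recovers the plain sum and
    0.0 ignores context tokens entirely.
    """
    if len(token_losses) != len(is_context):
        raise ValueError("token_losses and is_context must have equal length")
    if not 0.0 <= discount <= 1.0:
        raise ValueError("discount must be in [0, 1]")
    total = 0.0
    for loss, ctx in zip(token_losses, is_context):
        weight = discount if ctx else 1.0  # full weight on the current sentence
        total += weight * loss
    return total


# Two context tokens followed by one current-sentence token.
example = context_discounted_loss([1.0, 2.0, 3.0], [True, True, False], 0.5)
```

Under these assumptions, a smaller discount pushes the training signal toward the current sentence while still letting the model condition on its context.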

Abstract

Context-aware translation can be achieved by processing a concatenation of consecutive sentences with the standard Transformer architecture. This paper investigates the intuitive idea of providing the model with explicit information about the position of the sentences contained in the concatenation window. We compare various methods to encode sentence positions into token representations, including novel methods. Our results show that the Transformer benefits from certain sentence position encoding methods on English to Russian translation if trained with a context-discounted loss (Lupo et al., 2022). However, the same benefits are not observed in English to German. Further empirical efforts are necessary to define the conditions under which the proposed approach is beneficial.
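
To make the idea of sentence position concrete, the sketch below assigns each token in a concatenated window the index of the sentence it belongs to. It is a minimal illustration under stated assumptions: the `<sep>` separator token, the left-to-right numbering from 0, and the function name are all hypothetical, and the abstract does not say which numbering convention or encoding the paper's methods use. Indices like these could then be mapped to vectors and combined with the token representations.

```python
from typing import List

SEP = "<sep>"  # hypothetical sentence separator; the real symbol is an assumption


def sentence_positions(tokens: List[str], sep: str = SEP) -> List[int]:
    """Return, for each token, the index of the sentence that contains it.

    Sentences are numbered left to right starting at 0. A separator token
    is counted as part of the sentence it closes.
    """
    positions: List[int] = []
    current = 0
    for tok in tokens:
        positions.append(current)
        if tok == sep:
            current += 1  # the next token starts a new sentence
    return positions


# A two-sentence window: one context sentence followed by the current one.
window = ["the", "cat", "sat", ".", SEP, "it", "purred", "."]
window_positions = sentence_positions(window)
```

For the window above, the first five tokens (including the separator) receive index 0 and the remaining three receive index 1.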

Code & Models

Repositories

lorelupo/focused-concat (PyTorch, official)

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications

Methods: Attention Is All You Need · Linear Layer · Absolute Position Encodings · Dropout · Layer Normalization · Dense Connections · Multi-Head Attention · Position-Wise Feed-Forward Layer · Adam · Softmax