I Have an Attention Bridge to Sell You: Generalization Capabilities of Modular Translation Architectures

Timothee Mickus; Raúl Vázquez; Joseph Attieh

arXiv:2404.17918·cs.CL·May 1, 2024

TL;DR

This paper investigates whether modular translation architectures, in particular attention bridges, improve translation quality and generalization. It finds that, for a given computational budget, non-modular models are comparable or preferable to every modular design studied.

Contribution

The study provides a comprehensive comparison of modular and non-modular translation models, highlighting that modularity does not necessarily improve translation quality or generalization.

Findings

01. Non-modular architectures often outperform modular ones at the same computational budget.

02. Modular approaches do not significantly improve translation quality or generalization.

03. Non-modular models are generally comparable or preferable to modular designs.

Abstract

Modularity is a paradigm of machine translation with the potential of bringing forth models that are large at training time and small during inference. Within this field of study, modular approaches, and in particular attention bridges, have been argued to improve the generalization capabilities of models by fostering language-independent representations. In the present paper, we study whether modularity affects translation quality, as well as how well modular architectures generalize across different evaluation scenarios. For a given computational budget, we find non-modular architectures to be always comparable or preferable to all modular designs we study.

Taxonomy

Topics: Semantic Web and Ontologies