Towards End-to-End In-Image Neural Machine Translation

Elman Mansimov; Mitchell Stern; Mia Chen; Orhan Firat; Jakob Uszkoreit; Puneet Jain

arXiv:2010.10648·cs.CL·October 22, 2020

TL;DR

This paper introduces an end-to-end neural model for in-image machine translation, which transforms an image containing text in one language into an image containing the same text in another language. It reports promising initial results obtained purely from pixel-level supervision.

Contribution

The paper proposes a neural approach that performs in-image translation directly from pixels, and presents it as a preliminary investigation of a task that has not been extensively explored.

Findings

01 Promising initial results achieved

02 System evaluated both quantitatively and qualitatively

03 Discussion of common failure modes included

Abstract

In this paper, we offer a preliminary investigation into the task of in-image machine translation: transforming an image containing text in one language into an image containing the same text in another language. We propose an end-to-end neural model for this task inspired by recent approaches to neural machine translation, and demonstrate promising initial results based purely on pixel-level supervision. We then offer a quantitative and qualitative evaluation of our system outputs and discuss some common failure modes. Finally, we conclude with directions for future work.
