Improving End-to-End Text Image Translation From the Auxiliary Text Translation Task

Cong Ma; Yaping Zhang; Mei Tu; Xu Han; Linghui Wu; Yang Zhao; Yu Zhou

arXiv:2210.03887·cs.CL·October 11, 2022

TL;DR

This paper introduces a multi-task learning approach for text image translation that leverages auxiliary text translation and recognition tasks to improve performance, effectively utilizing large-scale text corpora.

Contribution

The novel integration of text translation as an auxiliary task in end-to-end text image translation enhances translation accuracy by sharing parameters and exploiting related tasks.

Findings

01 Outperforms existing end-to-end methods in experiments

02 Joint multi-task learning improves translation and recognition results

03 Auxiliary tasks are complementary and beneficial

Abstract

End-to-end text image translation (TIT), which aims to translate source-language text embedded in images into a target language, has attracted intensive attention in recent research. However, data sparsity limits the performance of end-to-end text image translation. Multi-task learning is a non-trivial way to alleviate this problem by exploiting knowledge from complementary related tasks. In this paper, we propose a novel text translation enhanced text image translation method, which trains the end-to-end model with text translation as an auxiliary task. By sharing model parameters and multi-task training, our model is able to take full advantage of easily available large-scale parallel text corpora. Extensive experimental results show that our proposed method outperforms existing end-to-end methods, and the joint multi-task learning with both text translation and recognition tasks achieves better…
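
One common way to picture multi-task training with auxiliary tasks is as a weighted sum of per-task losses that is minimized jointly. The sketch below is an illustration only: the weighted-sum form, the `TaskLosses` container, and the weights `mt_weight` and `ocr_weight` are assumptions made for this example, not details taken from the paper or its repository.

```python
from dataclasses import dataclass


@dataclass
class TaskLosses:
    """Per-batch loss values for each task (hypothetical container)."""
    tit: float        # main task: text image translation
    mt: float         # auxiliary task: text translation
    ocr: float = 0.0  # auxiliary task: text recognition (optional)


def joint_loss(losses: TaskLosses,
               mt_weight: float = 1.0,
               ocr_weight: float = 1.0) -> float:
    """Combine the main loss with weighted auxiliary losses.

    Assumed form: L = L_tit + mt_weight * L_mt + ocr_weight * L_ocr.
    Setting a weight to 0.0 switches the corresponding auxiliary task off.
    """
    return losses.tit + mt_weight * losses.mt + ocr_weight * losses.ocr


# Example: translation-only auxiliary training with a down-weighted MT loss.
batch = TaskLosses(tit=2.0, mt=1.0, ocr=0.5)
total = joint_loss(batch, mt_weight=0.5, ocr_weight=0.0)
```

In a real training loop the three loss values would come from forward passes through a model whose shared parameters receive gradients from every task; here they are plain floats so the example runs on its own.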

Code & Models

Repositories

EriCongMa/E2E_TIT_With_MT
PyTorch · Official

Taxonomy

Topics: Handwritten Text Recognition Techniques · Natural Language Processing Techniques · Multimodal Machine Learning Applications