Towards Fully Automated Manga Translation
Ryota Hinami, Shonosuke Ishiwatari, Kazuhiko Yasuda, and Yusuke Matsui

TL;DR
This paper introduces a multimodal, context-aware manga translation framework that draws on image context, together with a method for automatically constructing training data and benchmarks for evaluation, enabling fully automated manga translation.
Contribution
It presents the first multimodal translation model to incorporate manga image context, an automatic corpus construction method, new evaluation benchmarks, and a comprehensive automated translation system.
Findings
First to incorporate image context into manga translation
Automatic corpus construction from manga and translations
Established benchmarks for manga translation
Abstract
We tackle the problem of machine translation of manga, Japanese comics. Manga translation involves two important problems in machine translation: context-aware and multimodal translation. Since text and images are mixed in an unstructured fashion in manga, obtaining context from the image is essential for manga translation. However, how to extract context from an image and integrate it into MT models remains an open problem. In addition, a corpus and benchmarks to train and evaluate such models are currently unavailable. In this paper, we make the following four contributions that establish the foundation of manga translation research. First, we propose a multimodal context-aware translation framework. We are the first to incorporate context information obtained from manga images. It enables us to translate texts in speech bubbles that cannot be translated without using context information…
Taxonomy
Topics: Natural Language Processing Techniques · Multimodal Machine Learning Applications · Video Analysis and Summarization
