Crossing Language Borders: A Pipeline for Indonesian Manhwa Translation
Nithyasri Narasimhan, Sagarika Singh

TL;DR
This paper presents an automated pipeline combining computer vision and NLP techniques to translate Indonesian Manhwa into English, addressing low-resource language challenges and improving accessibility.
Contribution
It introduces a novel automated translation pipeline specifically designed for Indonesian Manhwa, integrating speech bubble detection, OCR, and machine translation.
Findings
Effective speech bubble detection with YOLOv5xu
Accurate OCR with Tesseract for Indonesian text
Successful translation from Indonesian to English
Abstract
In this project, we develop a practical and efficient solution for automating Manhwa translation from Indonesian to English. Our approach combines computer vision, text recognition, and natural language processing techniques to streamline the traditionally manual process of translating Manhwa (Korean comics). The pipeline includes a fine-tuned YOLOv5xu for speech bubble detection, Tesseract for OCR, and a fine-tuned MarianMT for machine translation. By automating these steps, we aim to make Manhwa more accessible to a global audience while saving time and effort compared to manual translation methods. While most Manhwa translation efforts focus on Japanese-to-English, we focus on Indonesian-to-English translation to address the challenges of working with low-resource languages. Our model shows good results at each step and translates from Indonesian to English efficiently.
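The three stages named in the abstract (bubble detection, OCR, translation) can be chained as in the minimal sketch below. It is not the authors' code. The names `translate_page`, `reading_order`, `BubbleTranslation`, and `build_stages` are hypothetical, the top-to-bottom reading order is an assumption, and the default checkpoints in `build_stages` (stock `yolov5xu.pt` and `Helsinki-NLP/opus-mt-id-en`) are stand-ins for the fine-tuned weights the paper describes, which are not given here.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2) in pixels


@dataclass
class BubbleTranslation:
    box: Box
    source_text: str
    translated_text: str


def reading_order(boxes: Sequence[Box]) -> List[Box]:
    # Assumption: bubbles are read top-to-bottom, then left-to-right.
    return sorted(boxes, key=lambda b: (b[1], b[0]))


def translate_page(
    image,
    detect: Callable[[object], Sequence[Box]],
    ocr: Callable[[object, Box], str],
    translate: Callable[[str], str],
) -> List[BubbleTranslation]:
    """Run detection -> OCR -> translation for one page."""
    results = []
    for box in reading_order(detect(image)):
        # Collapse line breaks and repeated spaces inside a bubble.
        text = " ".join(ocr(image, box).split())
        if not text:
            continue  # skip bubbles where OCR found no text
        results.append(BubbleTranslation(box, text, translate(text)))
    return results


def build_stages(weights: str = "yolov5xu.pt",
                 mt_name: str = "Helsinki-NLP/opus-mt-id-en"):
    """Hypothetical wiring to real libraries; both defaults are assumptions.

    Requires ultralytics, pytesseract (with the Tesseract 'ind' language
    data installed), and transformers. Expects `image` to be a PIL image.
    """
    from ultralytics import YOLO
    import pytesseract
    from transformers import MarianMTModel, MarianTokenizer

    detector = YOLO(weights)
    tokenizer = MarianTokenizer.from_pretrained(mt_name)
    model = MarianMTModel.from_pretrained(mt_name)

    def detect(image):
        rows = detector(image)[0].boxes.xyxy.tolist()
        return [tuple(int(v) for v in row) for row in rows]

    def ocr(image, box):
        return pytesseract.image_to_string(image.crop(box), lang="ind")

    def translate(text):
        batch = tokenizer([text], return_tensors="pt", padding=True)
        output = model.generate(**batch)
        return tokenizer.decode(output[0], skip_special_tokens=True)

    return detect, ocr, translate
```

Passing the stages in as callables keeps the orchestration testable with simple stubs and lets each stage be swapped for a fine-tuned model without touching `translate_page`.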
Taxonomy
Topics: Natural Language Processing Techniques
Methods: Focus
