ATD-Trans: A Geographically Grounded Japanese-English Travelogue Translation Dataset
Shohei Higashiyama, Hiroki Ouchi, Atsushi Fujita, Masao Utiyama

TL;DR
This paper introduces ATD-Trans, a Japanese-English travelogue translation dataset that enables evaluation of machine translation quality at both the overall and geo-entity levels, across domestic (within Japan) and overseas regions.
Contribution
It provides a new geographically grounded translation dataset for Japanese-English travel texts, facilitating analysis of MT performance on geo-entities in different regions.
Findings
Japanese-enhanced models show advantages when translating these travelogues.
Geo-entities in domestic regions (within Japan) are harder to translate than those in overseas regions.
The dataset supports evaluation at both overall and geo-entity levels.
Abstract
Geographic text, or textual data rich in geographic (geo-) information, is a valuable source for various geographic applications, e.g., tourism management. Making such information accessible to speakers of other languages further enhances its utility; thus, accurate machine translation (MT) is essential for equity in multilingual geo-information access. To facilitate in-depth analysis of geographic text, we introduce ATD-Trans, a geographically grounded Japanese-English travelogue translation dataset, which enables evaluation of MT quality at both the overall and geo-entity levels across domestic (within Japan) and overseas regions. Our experiments on existing language models examine two factors: model language focus and geographic regions. The results highlight advantages of Japanese-enhanced models and greater difficulty in translating domestic-region geo-entities mentioned in travel…
