Arukikata Travelogue Dataset with Geographic Entity Mention, Coreference, and Link Annotation
Shohei Higashiyama, Hiroki Ouchi, Hiroki Teranishi, Hiroyuki Otomo, Yusuke Ide, Aitaro Yamamoto, Hiroyuki Shindo, Yuki Matsuda, Shoko Wakamiya, Naoya Inoue, Ikuya Yamada, Taro Watanabe

TL;DR
This paper introduces a Japanese travelogue dataset with detailed geo-entity annotations, including mentions, coreferences, and links, to facilitate research in document-level geoparsing.
Contribution
It provides a new, richly annotated dataset specifically designed for evaluating document-level geoparsing systems in Japanese travelogues.
Findings
Dataset contains 200 travelogue documents with 12,171 geo-entity mentions.
Includes 6,339 coreference clusters and 2,551 linked geo-entities.
Supports evaluation of document-level geoparsing systems on Japanese travelogues.
Abstract
Geoparsing is a fundamental technique for analyzing geo-entity information in text. We focus on document-level geoparsing, which considers geographic relatedness among geo-entity mentions, and present a Japanese travelogue dataset designed for evaluating document-level geoparsing systems. Our dataset comprises 200 travelogue documents with rich geo-entity information: 12,171 mentions, 6,339 coreference clusters, and 2,551 geo-entities linked to geo-database entries.
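The three annotation layers named in the abstract (mentions, coreference clusters, and links to geo-database entries) can be pictured with a minimal Python sketch. Everything here is an assumption made for illustration: the class names, the field names, the `geo-db:0001` identifier, and the English toy sentence are hypothetical and do not reflect the dataset's actual file format or contents.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Mention:
    """A geo-entity mention as a character span (hypothetical schema)."""
    start: int  # offset of the first character of the mention
    end: int    # offset just past the last character of the mention
    text: str   # surface string covered by the span


@dataclass
class Cluster:
    """Mentions that refer to the same geo-entity (hypothetical schema)."""
    mentions: List[Mention] = field(default_factory=list)
    link: Optional[str] = None  # geo-database entry ID, or None if unlinked


def mention_of(doc: str, surface: str) -> Mention:
    """Build a Mention for the first occurrence of `surface` in `doc`."""
    start = doc.index(surface)
    return Mention(start, start + len(surface), surface)


def summarize(clusters: List[Cluster]) -> Dict[str, int]:
    """Count mentions, clusters, and linked clusters."""
    return {
        "mentions": sum(len(c.mentions) for c in clusters),
        "clusters": len(clusters),
        "linked": sum(1 for c in clusters if c.link is not None),
    }


# Toy document, invented for this sketch (not taken from the dataset).
doc = "We arrived at Kyoto Station. The station was crowded, so we found a nearby cafe."

clusters = [
    # Two coreferent mentions, linked to a made-up database entry ID.
    Cluster(
        mentions=[mention_of(doc, "Kyoto Station"), mention_of(doc, "The station")],
        link="geo-db:0001",
    ),
    # A single mention that cannot be resolved to a database entry.
    Cluster(mentions=[mention_of(doc, "a nearby cafe")], link=None),
]

stats = summarize(clusters)
print(stats)
```

Under this reading, the dataset-level figures in the abstract correspond to the same three counts aggregated over all 200 documents.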
Taxonomy
Topics: Semantic Web and Ontologies · Geographic Information Systems Studies · Natural Language Processing Techniques
