Economy Watchers Survey Provides Datasets and Tasks for Japanese Financial Domain
Masahiro Suzuki, Hiroki Sakaji

TL;DR
This paper introduces comprehensive Japanese financial NLP datasets derived from government data, enabling classification and sentiment analysis tasks, with an automatic update system to keep datasets current.
Contribution
It provides the first large-scale, multi-task Japanese financial NLP datasets with an automatic update framework, filling a critical gap in multilingual financial NLP resources.
Findings
Datasets enable sentence classification and sentiment analysis in Japanese finance.
Automatic update framework ensures datasets stay current.
Facilitates development and evaluation of Japanese financial NLP models.
Abstract
Natural language processing (NLP) tasks in English and in general domains are widely available and are often used to evaluate pre-trained language models. In contrast, fewer tasks are available for languages other than English and for the financial domain. Tasks in the Japanese financial domain are particularly limited. We develop two large datasets using data published by a Japanese central government agency. The datasets provide three Japanese financial NLP tasks: 3-class and 12-class classification tasks for categorizing sentences, and a 5-class classification task for sentiment analysis. Our datasets are designed to be comprehensive and up to date: an automatic update framework ensures that the latest task datasets are always publicly available.
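The three tasks named in the abstract can be written down as a small task registry. The following is a minimal sketch, not the authors' code: the task names, the `TaskSpec` class, and the integer label encoding are hypothetical, and only the class counts (3, 12, and 5) come from the abstract. The chance-accuracy helper shows the accuracy a uniform random guesser would reach on each task, which is a common sanity-check floor when evaluating a classifier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskSpec:
    name: str         # hypothetical identifier, not taken from the paper
    num_classes: int  # class count as stated in the abstract


# Only the class counts are from the abstract; the names are illustrative.
TASKS = (
    TaskSpec("sentence_category_3class", 3),
    TaskSpec("sentence_category_12class", 12),
    TaskSpec("sentiment_5class", 5),
)


def validate_label(task: TaskSpec, label: int) -> int:
    """Return the label if it is a valid class index for the task, else raise."""
    if not 0 <= label < task.num_classes:
        raise ValueError(f"label {label} is out of range for task {task.name!r}")
    return label


def chance_accuracy(task: TaskSpec) -> float:
    """Expected accuracy of a uniform random guess over the task's classes."""
    return 1.0 / task.num_classes


if __name__ == "__main__":
    for task in TASKS:
        print(f"{task.name}: {task.num_classes} classes, "
              f"chance accuracy {chance_accuracy(task):.3f}")
```

Labels are assumed here to be zero-based integers; the released datasets may encode them differently, so the mapping should be checked against the actual data before use.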
Taxonomy
Topics: Big Data Technologies and Applications
