JPS-daprinfo: A Dataset for Japanese Dialog Act Analysis and People-related Information Detection

Changzeng Fu

arXiv:2103.11786 · cs.CL · March 23, 2021 · 1 citation

TL;DR

This paper introduces JPS-daprinfo, a Japanese dialogue dataset annotated for dialog act analysis and people-related information detection. It is built from 50 interview dialogues of 30 minutes each, about 25 hours in total.

Contribution

The paper provides a new annotated Japanese dialogue dataset with 13 labels, intended for dialog act analysis and people-related information detection.

Findings

01 Annotated 20,130 sentences with 13 labels

02 Dataset based on 50 interview dialogues with native Japanese speakers

03 Facilitates research in Japanese dialog analysis

Abstract

We annotated a spoken Japanese dataset (I-JAS) for text classification. The dataset contains 50 interview dialogues of two-way Japanese conversation in which the participants discuss their past, present, and future. Each dialogue is 30 minutes long. From this dataset, we selected the interview dialogues of native Japanese speakers as samples and annotated their sentences with 13 labels. The labeling was carried out by native Japanese speakers with experience in data annotation. In total, 20,130 samples were annotated.

Code & Models

Repositories

CZFuChason/JPS-daprinfo (Official)

Taxonomy

Topics: Natural Language Processing Techniques