Machines Getting with the Program: Understanding Intent Arguments of Non-Canonical Directives

Won Ik Cho; Young Ki Moon; Sangwhan Moon; Seok Min Kim; Nam Soo Kim

arXiv:1912.00342·cs.CL·October 8, 2020

TL;DR

This paper introduces a new Korean corpus of 50K question/command-intent pairs for understanding non-canonical directives in dialogue systems, along with a method for mitigating class imbalance and a demonstration that the approach can potentially be extended to other languages.

Contribution

The paper presents novel guidelines for corpus creation, a large Korean dataset, and a method for mitigating class imbalance in intent classification for non-canonical speech.

Findings

01 Constructed a 50K-instance Korean intent dataset

02 Proposed a class imbalance mitigation method

03 Demonstrated potential for multilingual extension

Abstract

Modern dialog managers face the challenge of having to fulfill human-level conversational skills as part of common user expectations, including but not limited to discourse with no clear objective. Along with these requirements, agents are expected to extrapolate intent from the user's dialogue even when subjected to non-canonical forms of speech. This depends on the agent's comprehension of paraphrased forms of such utterances. Especially in low-resource languages, the lack of data is a bottleneck that prevents advancements of the comprehension performance for these types of agents. In this regard, here we demonstrate the necessity of extracting the intent argument of non-canonical directives in a natural language format, which may yield more accurate parsing, and suggest guidelines for building a parallel corpus for this purpose. Following the guidelines, we construct a Korean corpus…

Code & Models

Repositories

warnikchow/sae4k
none · Official

Datasets

wicho/kor_sae
dataset · 35 dl
