Building Dialogue Understanding Models for Low-resource Language Indonesian from Scratch
Donglin Di, Weinan Zhang, Yue Zhang, Fanglin Wang

TL;DR
This paper introduces a framework for building dialogue understanding models for Indonesian, a low-resource language, by leveraging cross-lingual transfer from English. It examines the cost-effectiveness of several training methods and releases datasets for future research.
Contribution
The paper proposes the BiCF framework for cross-lingual transfer in low-resource dialogue tasks and provides extensive experiments and datasets to support this approach.
Findings
BiCF framework achieves reliable performance with minimal Indonesian data
Extensive experiments validate cost-efficiency of the proposed methods
Release of large-scale Indonesian dialogue datasets for future research
Abstract
Using off-the-shelf resources from resource-rich languages to transfer knowledge to low-resource languages has attracted much attention recently. However, the requirements for enabling a model to reach reliable performance, such as the scale of annotated data required or an effective framework, are not well understood. To investigate the first question, we empirically study the cost-effectiveness of several methods for training intent classification and slot-filling models for Indonesian (ID) from scratch by utilizing English data. To address the second challenge, we propose a Bi-Confidence-Frequency Cross-Lingual transfer framework (BiCF), composed of "BiCF Mixing", "Latent Space Refinement" and "Joint Decoder", to tackle the lack of dialogue data in low-resource languages. Extensive experiments demonstrate our framework performs reliably and cost-efficiently…
Taxonomy
Topics: Speech and dialogue systems
Methods: Softmax · Attention Is All You Need
