The Pipeline System of ASR and NLU with MLM-based Data Augmentation toward STOP Low-resource Challenge
Hayato Futami, Jessica Huynh, Siddhant Arora, Shih-Lun Wu, Yosuke Kashiwagi, Yifan Peng, Brian Yan, Emiru Tsunoo, Shinji Watanabe

TL;DR
This paper presents a pipeline system that combines ASR and NLU with MLM-based data augmentation for low-resource spoken language understanding; it won 1st place in Track 3 of the Spoken Language Understanding Grand Challenge at ICASSP 2023.
Contribution
It applies MLM-based data augmentation and retrieval-based input augmentation to low-resource domain adaptation in spoken language understanding.
Findings
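Achieved 69.15% average exact match (EM) accuracy across the reminder (63.3%) and weather (75.0%) domains
Won 1st place in the low-resource domain adaptation track (Track 3) of the Spoken Language Understanding Grand Challenge at ICASSP 2023
Adapted to the low-resource domains using MLM-based data augmentation and retrieval of similar training samples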
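The reported average is the unweighted mean of the two per-domain EM scores given in the abstract (63.3 for reminder, 75.0 for weather). A minimal sketch of that arithmetic follows. The function names are hypothetical, and the whitespace normalization is an assumption made for illustration; the challenge's official scoring script may normalize differently.

```python
from typing import Sequence


def exact_match(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Return the percentage of predictions identical to their reference.

    Assumption: runs of whitespace are collapsed before comparison.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("at least one example is required")
    hits = sum(
        " ".join(pred.split()) == " ".join(ref.split())
        for pred, ref in zip(predictions, references)
    )
    return 100.0 * hits / len(references)


def macro_average(scores: Sequence[float]) -> float:
    """Unweighted mean of per-domain scores."""
    return sum(scores) / len(scores)


# Per-domain scores as reported in the abstract: reminder, weather.
average_em = macro_average([63.3, 75.0])
print(f"{average_em:.2f}")
```

Because the mean is unweighted, each domain counts equally regardless of how many test utterances it contains.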
Abstract
This paper describes our system for the low-resource domain adaptation track (Track 3) of the Spoken Language Understanding Grand Challenge, which is part of the ICASSP Signal Processing Grand Challenge 2023. In this track, we adopt a pipeline approach of ASR and NLU. For ASR, we fine-tune Whisper for each domain with upsampling. For NLU, we fine-tune BART on all the Track 3 data and then on the low-resource domain data. We apply masked LM (MLM)-based data augmentation, where some of the input tokens and their corresponding target labels are replaced using an MLM. We also apply a retrieval-based approach, where the model input is augmented with similar training samples. As a result, we achieved exact match (EM) accuracies of 63.3/75.0 (average: 69.15) for the reminder/weather domains and won 1st place in the challenge.
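The MLM-based augmentation step can be sketched roughly as below. This is an illustration under stated assumptions, not the authors' implementation: it assumes a bracketed semantic-parse target in which every input word also appears in order, and it abstracts the masked LM as a callable. `toy_predict` is a hypothetical stand-in so the block runs without downloading a model; `mlm_augment`, `sample_positions`, and the `<mask>` string are likewise illustrative names.

```python
import random
from typing import Callable, List, Sequence, Tuple

MASK = "<mask>"  # assumed mask string; a real tokenizer defines its own


def is_label(token: str) -> bool:
    """True for parse brackets such as "[IN:GET_WEATHER" or "]"."""
    return token.startswith("[") or token == "]"


def sample_positions(n_tokens: int, mask_prob: float, rng: random.Random) -> List[int]:
    """Select each input position independently with probability mask_prob."""
    return [i for i in range(n_tokens) if rng.random() < mask_prob]


def mlm_augment(
    tokens: Sequence[str],
    target: Sequence[str],
    positions: Sequence[int],
    predict: Callable[[List[str], int], str],
) -> Tuple[List[str], List[str]]:
    """Replace the chosen input tokens and mirror each change in the target."""
    # Indices of the target entries that are words rather than labels.
    word_slots = [j for j, tok in enumerate(target) if not is_label(tok)]
    if [target[j] for j in word_slots] != list(tokens):
        raise ValueError("target words must match the input tokens in order")
    new_tokens, new_target = list(tokens), list(target)
    for i in positions:
        masked = new_tokens[:i] + [MASK] + new_tokens[i + 1:]
        replacement = predict(masked, i)
        new_tokens[i] = replacement
        new_target[word_slots[i]] = replacement  # keep input and label in sync
    return new_tokens, new_target


def toy_predict(masked_tokens: List[str], position: int) -> str:
    """Hypothetical stand-in for a masked LM's prediction at `position`."""
    if position > 0 and masked_tokens[position - 1] == "in":
        return "denver"
    return "forecast"


tokens = "what is the weather in boston".split()
target = "[IN:GET_WEATHER what is the weather in [SL:LOCATION boston ] ]".split()
aug_tokens, aug_target = mlm_augment(tokens, target, [5], toy_predict)
print(" ".join(aug_tokens))
print(" ".join(aug_target))
```

In practice `predict` would wrap a real masked language model, and `sample_positions` would choose which tokens to mask. Replacing the same word in both the input and the target keeps the slot value consistent with the utterance.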
Taxonomy
Topics: Natural Language Processing Techniques · Speech and dialogue systems · Speech Recognition and Synthesis
Methods: Multi-Head Attention · Attention Is All You Need · Softmax · Adam · Layer Normalization · Linear Layer · Dropout · Byte Pair Encoding · Residual Connection
