Bottleneck Low-rank Transformers for Low-resource Spoken Language Understanding

Pu Wang; Hugo Van hamme

arXiv:2206.14318·cs.CL·June 30, 2022

Open Access

TL;DR

This paper introduces a compact low-rank transformer with attention bottlenecks for spoken language understanding (SLU). Trained without pretraining in a low-resource setting, it reaches accuracy competitive with large pretrained models.

Contribution

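It proposes a lean transformer structure in which group sparsity automatically reduces the dimension of the attention mechanism, along with a variant that transfers the learned attention subspace to an attention bottleneck layer. The result is a smaller model whose accuracy stays competitive on low-resource SLU tasks.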

Findings

01 Achieves competitive accuracy without pretraining

02 Reduces model size significantly

03 Effective in low-resource settings

Abstract

End-to-end spoken language understanding (SLU) systems benefit from pretraining on large corpora, followed by fine-tuning on application-specific data. The resulting models are too large for on-edge applications. For instance, BERT-based systems contain over 110M parameters. Observing that such models are overparameterized, we propose a lean transformer structure in which the dimension of the attention mechanism is automatically reduced using group sparsity. We also propose a variant in which the learned attention subspace is transferred to an attention bottleneck layer. In a low-resource setting and without pretraining, the resulting compact SLU model achieves accuracies competitive with large pretrained models.
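
To make the group-sparsity idea in the abstract concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify how groups are formed, so the sketch assumes one group per attention dimension (column j of the query projection together with column j of the key projection). The names `group_norms`, `group_sparsity_penalty`, `prune_dims`, and `attention_scores` are hypothetical, and the zeroing step merely stands in for what a sparsity-regularized training run might produce.

```python
import math
import random


def group_norms(w_q, w_k):
    """L2 norm per group; group j = column j of w_q plus column j of w_k (assumed grouping)."""
    n_dims = len(w_q[0])
    return [
        math.sqrt(sum(row[j] ** 2 for row in w_q) + sum(row[j] ** 2 for row in w_k))
        for j in range(n_dims)
    ]


def group_sparsity_penalty(w_q, w_k):
    """Group-lasso style penalty: the sum of the per-group L2 norms."""
    return sum(group_norms(w_q, w_k))


def prune_dims(w_q, w_k, threshold=1e-3):
    """Drop attention dimensions whose group norm is at or below the threshold."""
    keep = [j for j, norm in enumerate(group_norms(w_q, w_k)) if norm > threshold]
    small_q = [[row[j] for j in keep] for row in w_q]
    small_k = [[row[j] for j in keep] for row in w_k]
    return small_q, small_k, keep


def matmul(a, b):
    """Plain-list matrix product of a (n x m) and b (m x p)."""
    return [
        [sum(a_row[k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for a_row in a
    ]


def attention_scores(x, w_q, w_k):
    """Unnormalized scores (x W_q)(x W_k)^T for a sequence x of shape (T, d_model)."""
    q = matmul(x, w_q)
    k = matmul(x, w_k)
    return [[sum(qi * ki for qi, ki in zip(q_row, k_row)) for k_row in k] for q_row in q]


random.seed(0)
d_model, d_attn, seq_len = 8, 6, 4  # toy sizes, chosen only for illustration
w_q = [[random.gauss(0, 1) for _ in range(d_attn)] for _ in range(d_model)]
w_k = [[random.gauss(0, 1) for _ in range(d_attn)] for _ in range(d_model)]

# Stand-in for training with the penalty: assume dimensions 2..5 were driven to zero.
for row in w_q + w_k:
    for j in range(2, d_attn):
        row[j] = 0.0

x = [[random.gauss(0, 1) for _ in range(d_model)] for _ in range(seq_len)]
small_q, small_k, keep = prune_dims(w_q, w_k)

full_scores = attention_scores(x, w_q, w_k)
small_scores = attention_scores(x, small_q, small_k)
```

In this toy setup the pruned projections have two attention dimensions instead of six, and the attention scores are unchanged because the removed columns contributed nothing. That is the sense in which a sparsity-driven reduction can shrink the attention mechanism without altering what it computes.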

Taxonomy

Topics: Multimodal Machine Learning Applications · Speech Recognition and Synthesis · Topic Modeling