SpeechPrompt: An Exploration of Prompt Tuning on Generative Spoken Language Model for Speech Processing Tasks

Kai-Wei Chang; Wei-Cheng Tseng; Shang-Wen Li; Hung-yi Lee

arXiv:2203.16773 · eess.AS · July 12, 2022 · 1 citation

TL;DR

This paper explores prompt tuning for generative spoken language models, demonstrating competitive speech classification performance with fewer parameters and discussing its potential in sequence generation tasks.

Contribution

It is the first to investigate prompt tuning in speech processing with generative spoken language models, showing efficiency and promising results.

Findings

01 Prompt tuning achieves competitive classification accuracy.

02 Prompt tuning requires fewer trainable parameters than fine-tuning.

03 Prompt tuning shows potential in sequence generation tasks.

Abstract

Speech representations learned from self-supervised learning (SSL) models can benefit various speech processing tasks. However, utilizing SSL representations usually requires fine-tuning the pre-trained models or designing task-specific downstream models and loss functions, which incurs high memory usage and human labor. Recently, prompting in Natural Language Processing (NLP) has been found to be an efficient technique to leverage pre-trained language models (LMs). Specifically, prompt tuning optimizes a limited number of task-specific parameters with a fixed pre-trained model; as a result, only a small set of parameters needs to be stored for each task. Prompt tuning improves computation and memory efficiency by leveraging the pre-trained LM's prediction ability. Nevertheless, such a paradigm is little studied in the speech community. We report in this paper the first exploration of…
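
The storage argument in the abstract (a fixed pre-trained model shared across tasks, plus a small set of task-specific prompt parameters) can be made concrete with a little bookkeeping. The sketch below is only an illustration, not the paper's implementation: the `FrozenLM` class, the layer count, hidden size, vocabulary size, prompt length, and number of tasks are all hypothetical values chosen for the example, and the per-layer parameter count uses the common rough estimate of about 12·d² for a transformer layer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FrozenLM:
    """Stand-in for a fixed pre-trained LM. All sizes are hypothetical."""
    num_layers: int
    hidden_dim: int
    vocab_size: int

    def num_params(self) -> int:
        # Rough estimate: ~12 * d^2 weights per transformer layer
        # (attention + feed-forward) plus the input embedding table.
        per_layer = 12 * self.hidden_dim ** 2
        return self.num_layers * per_layer + self.vocab_size * self.hidden_dim


def prompt_params(prompt_length: int, hidden_dim: int) -> int:
    """Trainable parameters for one task: one vector per prompt position."""
    return prompt_length * hidden_dim


def prepend_prompt(prompt, token_embeddings):
    """Place the task-specific prompt vectors before the input embeddings.

    The frozen LM then processes the longer sequence; only `prompt`
    would receive gradient updates during training.
    """
    return list(prompt) + list(token_embeddings)


# Hypothetical configuration, chosen only for illustration.
lm = FrozenLM(num_layers=12, hidden_dim=768, vocab_size=100)
num_tasks = 4
prompt_length = 20

# Fine-tuning stores a full model copy per task.
fine_tune_storage = num_tasks * lm.num_params()

# Prompt tuning stores one shared frozen model plus one prompt per task.
prompt_storage = lm.num_params() + num_tasks * prompt_params(prompt_length, lm.hidden_dim)

if __name__ == "__main__":
    print("parameters per fine-tuned task:", lm.num_params())
    print("parameters per prompt-tuned task:", prompt_params(prompt_length, lm.hidden_dim))
    print("total storage, fine-tuning:", fine_tune_storage)
    print("total storage, prompt tuning:", prompt_storage)
```

Under these assumed sizes, each additional task adds only `prompt_length * hidden_dim` stored values instead of a full copy of the model, which is the efficiency gain the abstract describes.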

Code & Models

Repositories

ga642381/SpeechPrompt (PyTorch, official)

Taxonomy

Topics: Speech Recognition and Synthesis · Topic Modeling · Natural Language Processing Techniques