Do Prompts Really Prompt? Exploring the Prompt Understanding Capability of Whisper

Chih-Kai Yang; Kuan-Po Huang; Hung-yi Lee

arXiv:2406.05806·cs.CL·September 17, 2024

TL;DR

This study investigates whether Whisper, a speech recognition model, truly understands prompts and how prompt quality affects its performance, revealing unexpected behaviors and limitations in prompt comprehension.

Contribution

The paper provides an empirical analysis of Whisper's prompt understanding, highlighting its limited comprehension and counter-intuitive responses to prompt quality and language cues.

Findings

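1. Whisper may not interpret textual prompts the way humans would expect.

2. Stronger adherence to the topic information in a prompt does not guarantee better performance.

3. English prompts generally outperform Mandarin prompts on datasets of both languages, likely because of differences in the training data distributions for the two languages.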

Abstract

This research explores how the information in prompts interacts with Whisper, a high-performing speech recognition model. We compare its performance when given prompts containing correct information against prompts corrupted with incorrect information. Our results unexpectedly show that Whisper may not understand textual prompts in the way humans would expect. Additionally, we find that stronger adherence to the topic information in textual prompts does not guarantee improved performance. English prompts also generally outperform Mandarin ones on datasets of both languages, likely due to differences in the training data distributions for these languages, despite the mismatch with pre-training scenarios. Conversely, we discover that Whisper exhibits awareness of misleading information in language tokens by ignoring incorrect language tokens and focusing on the correct…
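The comparison described in the abstract, correct prompts versus prompts corrupted with incorrect information, can be sketched as a small evaluation loop. This is an illustrative sketch only, not the authors' code: the `transcribe` callable, the prompt template, the rule for picking a wrong topic, and the use of word error rate as the performance measure are all assumptions made for this example.

```python
from typing import Callable, Iterable, List, Tuple


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    prev = list(range(len(hyp) + 1))  # distances for the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


def corrupt_topic(true_topic: str, topics: List[str]) -> str:
    """Hypothetical corruption rule: take the next topic in a list of unique topics."""
    if len(topics) < 2:
        raise ValueError("need at least two topics to pick a wrong one")
    return topics[(topics.index(true_topic) + 1) % len(topics)]


def compare_prompts(
    transcribe: Callable[[str, str], str],      # assumed interface: (audio, prompt) -> text
    samples: Iterable[Tuple[str, str, str]],    # (audio, reference transcript, true topic)
    topics: List[str],
    template: str = "This utterance is about {topic}.",  # hypothetical prompt template
) -> Tuple[float, float]:
    """Return mean WER with correct prompts and mean WER with corrupted prompts."""
    correct_total = corrupted_total = 0.0
    count = 0
    for audio, reference, topic in samples:
        good = template.format(topic=topic)
        bad = template.format(topic=corrupt_topic(topic, topics))
        correct_total += word_error_rate(reference, transcribe(audio, good))
        corrupted_total += word_error_rate(reference, transcribe(audio, bad))
        count += 1
    if count == 0:
        raise ValueError("no samples provided")
    return correct_total / count, corrupted_total / count
```

If a model used the prompt as a human would, the corrupted-prompt error rate should be clearly higher than the correct-prompt one; a small or reversed gap would be consistent with the behavior the abstract reports. Any real speech recognizer can be wrapped to match the assumed `transcribe(audio, prompt)` signature.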

Code & Models

Repositories

b08202033/whisper_prompting (PyTorch, official)
