Audio Is the Achilles' Heel: Red Teaming Audio Large Multimodal Models

Hao Yang; Lizhen Qu; Ehsan Shareghi; Gholamreza Haffari

arXiv:2410.23861·cs.CL·November 1, 2024

TL;DR

This paper red teams five advanced audio large multimodal models under three attack settings and finds that they frequently produce harmful outputs, highlighting the need for stronger safety measures.

Contribution

The paper presents the first comprehensive red-teaming study of the safety of audio large multimodal models across multiple attack settings.

Findings

01 Open-source audio LMMs suffer an average attack success rate of 69.14% on harmful audio questions.

02 Models exhibit safety vulnerabilities when distracted by non-speech audio noise.

03 Speech-specific jailbreaks achieve a 70.67% attack success rate on the harmful query benchmark.
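
The attack success rates quoted in these findings are percentages of attack attempts that elicited a harmful response. A minimal sketch of that arithmetic follows, assuming each model response has already been judged harmful or not. The function name, model names, and judgement values are hypothetical illustrations, not the paper's code or data, and averaging per-model rates is only one assumed way an "average" rate could be formed.

```python
from statistics import mean


def attack_success_rate(judgements):
    """Return the percentage of attempts judged to have produced a harmful response."""
    if not judgements:
        raise ValueError("need at least one judged response")
    # True counts as 1 and False as 0, so sum() gives the number of successes.
    return 100.0 * sum(judgements) / len(judgements)


# Hypothetical judgements (True = harmful response); NOT data from the paper.
per_model = {
    "model_a": [True, True, False, True],
    "model_b": [True, False, False, False],
}

# Per-model rate, then an unweighted average across models (an assumption).
per_model_asr = {name: attack_success_rate(j) for name, j in per_model.items()}
average_asr = mean(list(per_model_asr.values()))
```

An unweighted mean treats every model equally regardless of how many queries it was tested on; pooling all judgements before dividing would weight by query count instead.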

Abstract

Large Multimodal Models (LMMs) have demonstrated the ability to interact with humans under real-world conditions by combining Large Language Models (LLMs) and modality encoders to align multimodal information (visual and auditory) with text. However, such models raise new safety challenges of whether models that are safety-aligned on text also exhibit consistent safeguards for multimodal inputs. Despite recent safety-alignment research on vision LMMs, the safety of audio LMMs remains under-explored. In this work, we comprehensively red team the safety of five advanced audio LMMs under three settings: (i) harmful questions in both audio and text formats, (ii) harmful questions in text format accompanied by distracting non-speech audio, and (iii) speech-specific jailbreaks. Our results under these settings demonstrate that open-source audio LMMs suffer an average attack success rate of…

Code & Models

Repositories

YangHao97/RedteamAudioLMMs (Official)

Videos

Audio Is the Achilles' Heel: Red Teaming Audio Large Multimodal Models

Taxonomy

Topics: Music and Audio Processing

Methods: ALIGN