V.O.I.C.E (Voice, Ownership, Identity, Control, Expression): Risk Taxonomy of Synthetic Voice Generation From Empirical Data
Tanusree Sharma, Anish Krishnagiri, Lili Dudas, Ahmed Adnan, Visar Berisha

TL;DR
This paper introduces V.O.I.C.E, a taxonomy of risks associated with synthetic voice generation. It is built from empirical data (recorded incidents, direct incident reports, and Reddit discussions) and highlights privacy and security concerns.
Contribution
It provides a data-driven, nuanced taxonomy of voice generation risks that considers real-world incidents and contextual factors, filling a gap in existing threat models.
Findings
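Identified 569 incidents from major AI incident databases, the FTC, and the Internet Crime Complaint Center (IC3).
Analyzed 1,067 direct incident reports from U.S.-based participants across diverse groups (voice actors, internet personalities, political personnel, and the general public).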
Examined 2,221 Reddit discussions related to voice risks.
Abstract
As generative voice models rapidly advance in both capability and public use, the unconsented collection, reuse, and synthesis of voice data are introducing new classes of privacy, security, and governance risk that are poorly captured by existing, largely uniform threat models. To fill this gap, we present V.O.I.C.E, a taxonomy of voice generation risk grounded in a multi-source threat modeling effort with 569 incidents from major AI incident databases, the FTC, and the Internet Crime Complaint Center (IC3); 1,067 direct incident reports from U.S.-based participants across diverse groups (including voice actors, internet personalities, political personnel, and the general public); and 2,221 Reddit discussions. Grounded in real-world data, our taxonomy explicitly models how risk emerges and interacts with contextual factors such as degree of exposure, social visibility, and the availability…
