On the Language and Gender Biases in PSTN, VoIP and Neural Audio Codecs
Kemal Altwlkany, Amar Kuric, Emanuel Lacic

TL;DR
This paper investigates language and gender biases in various audio codecs, revealing that PSTN codecs are gender-biased and neural codecs exhibit language biases, which can impact fairness in speech technology applications.
Contribution
It contributes to the scarce research on language and gender biases in PSTN, VoIP, and neural audio codecs by analyzing speech quality on over 2 million multilingual audio files after transcoding.
Findings
PSTN codecs are strongly gender-biased.
Neural codecs introduce language biases.
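When audio is transcoded before processing, codec bias can cause disparities in downstream speech applications such as automatic speech recognition and speech sentiment analysis.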
Abstract
In recent years, there has been a growing focus on fairness and inclusivity within speech technology, particularly in areas such as automatic speech recognition and speech sentiment analysis. When audio is transcoded prior to processing, as is the case in streaming or real-time applications, any inherent bias in the coding mechanism may result in disparities. This not only affects user experience but can also have broader societal implications by perpetuating stereotypes and exclusion. Thus, it is important that audio coding mechanisms are unbiased. In this work, we contribute towards the scarce research with respect to language and gender biases of audio codecs. By analyzing the speech quality of over 2 million multilingual audio files after transcoding through a representative subset of codecs (PSTN, VoIP and neural), our results indicate that PSTN codecs are strongly biased in terms…
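As a rough illustration of the kind of comparison the abstract describes, the sketch below averages a per-file speech quality score by group (for example gender or language) and reports the largest gap between groups. The function names, the group labels, and the scores are all assumptions made up for this example. The abstract as shown does not name the quality metric or the statistical procedure the authors used, so this is not their method.

```python
from collections import defaultdict
from statistics import mean


def group_means(records):
    """Average quality score per group from (group, score) pairs."""
    buckets = defaultdict(list)
    for group, score in records:
        buckets[group].append(score)
    return {group: mean(scores) for group, scores in buckets.items()}


def max_gap(means):
    """Largest difference between any two group averages."""
    return max(means.values()) - min(means.values())


# Hypothetical post-transcoding quality scores, invented for illustration;
# these are NOT results from the paper.
records = [
    ("female", 3.0),
    ("female", 3.5),
    ("male", 4.0),
    ("male", 4.5),
]

means = group_means(records)
gap = max_gap(means)
print(means, gap)
```

A gap near zero would suggest the codec treats the groups similarly on this score. A real analysis would also need a significance test and a comparison against the scores of the untranscoded audio.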
