Synthetic Speech Classification: IEEE Signal Processing Cup 2022 challenge

Mahieyin Rahmun; Rafat Hasan Khan; Tanjim Taharat Aurpa; Sadia Khan; Zulker Nayeen Nahiyan; Mir Sayad Bin Almas; Rakibul Hasan Rajib; Syeda Sakira; Hassan

arXiv:2412.13279·cs.SD·December 19, 2024

TL;DR

This paper develops a robust synthetic speech classifier for the IEEE Signal Processing Cup 2022 challenge, comparing classical and deep learning methods, and finds deep learning with raw data performs best.

Contribution

It trains a synthetic speech attribution model on speech generated by various known TTS algorithms as well as unknown ones, and evaluates multiple classical machine learning and deep learning approaches.

Findings

1. Deep learning methods outperform classical models.

2. Raw data-based deep learning models achieve the best accuracy.

3. The approach effectively distinguishes synthetic speech from different TTS sources.

Abstract

The aim of this project is to implement and design a robust synthetic speech classifier for the IEEE Signal Processing Cup 2022 challenge. Here, we learn a synthetic speech attribution model using the speech generated from various text-to-speech (TTS) algorithms as well as unknown TTS algorithms. We experiment with both the classical machine learning methods such as support vector machine, Gaussian mixture model, and deep learning based methods such as ResNet, VGG16, and two shallow end-to-end networks. We observe that deep learning based methods with raw data demonstrate the best performance.
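To make the attribution setup in the abstract concrete, the following is a minimal sketch of a classical, Gaussian-model-style attributor. It is not the authors' implementation. It assumes each utterance has already been reduced to a fixed-length feature vector (the abstract does not say which features the classical models used), fits a single diagonal Gaussian per class as a one-component special case of a Gaussian mixture model, and treats speech from unknown TTS algorithms as one more training label. The function names, the `var_floor` parameter, the class labels, and the toy 2-D feature values are all hypothetical.

```python
import math
from collections import defaultdict


def fit_diag_gaussians(features, labels, var_floor=1e-3):
    """Fit one diagonal Gaussian per class (a one-component GMM per class)."""
    grouped = defaultdict(list)
    for x, y in zip(features, labels):
        grouped[y].append(x)

    models = {}
    for label, rows in grouped.items():
        n = len(rows)
        dim = len(rows[0])
        mean = [sum(r[d] for r in rows) / n for d in range(dim)]
        # Floor the variance so a near-constant feature cannot cause division by zero.
        var = [
            max(sum((r[d] - mean[d]) ** 2 for r in rows) / n, var_floor)
            for d in range(dim)
        ]
        models[label] = (mean, var)
    return models


def log_likelihood(x, model):
    """Log-density of x under a diagonal Gaussian given as (mean, var)."""
    mean, var = model
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def attribute(x, models):
    """Return the class label whose Gaussian assigns x the highest likelihood."""
    return max(models, key=lambda label: log_likelihood(x, models[label]))


# Hypothetical toy features: two known TTS sources plus an "unknown" class.
train_x = [
    [0.0, 0.1], [0.2, -0.1], [-0.1, 0.0],    # tts_a
    [3.0, 3.1], [2.9, 3.0], [3.2, 2.8],      # tts_b
    [-3.0, 3.0], [-3.1, 2.8], [-2.9, 3.2],   # unknown
]
train_y = ["tts_a"] * 3 + ["tts_b"] * 3 + ["unknown"] * 3

models = fit_diag_gaussians(train_x, train_y)
print(attribute([0.1, 0.0], models))
```

The deep learning methods named in the abstract replace both the hand-chosen features and the per-class density models with a single network trained on the raw data, which is the configuration the authors report as performing best.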

Code & Models

Repositories

AGenCyLab/SPCUP2022 (PyTorch, official)

Taxonomy

Topics: Speech Recognition and Synthesis