Synthetic Speech Classification: IEEE Signal Processing Cup 2022 challenge
Mahieyin Rahmun, Rafat Hasan Khan, Tanjim Taharat Aurpa, Sadia Khan, Zulker Nayeen Nahiyan, Mir Sayad Bin Almas, Rakibul Hasan Rajib, Syeda Sakira, Hassan

TL;DR
This paper develops a robust synthetic speech classifier for the IEEE Signal Processing Cup 2022 challenge, comparing classical and deep learning methods, and finds deep learning with raw data performs best.
Contribution
It introduces a synthetic speech attribution model trained on speech generated by various TTS algorithms, including unknown ones, and evaluates multiple classical machine learning and deep learning approaches.
Findings
Deep learning methods outperform classical models.
Raw data-based deep learning models achieve the best accuracy.
The approach effectively distinguishes among synthetic speech samples produced by different TTS sources.
Abstract
The aim of this project is to implement and design a robust synthetic speech classifier for the IEEE Signal Processing Cup 2022 challenge. Here, we learn a synthetic speech attribution model using the speech generated from various text-to-speech (TTS) algorithms as well as unknown TTS algorithms. We experiment with both the classical machine learning methods such as support vector machine, Gaussian mixture model, and deep learning based methods such as ResNet, VGG16, and two shallow end-to-end networks. We observe that deep learning based methods with raw data demonstrate the best performance.
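To make the attribution idea concrete, the following is a minimal sketch of a generative classifier in the spirit of the Gaussian mixture model baseline mentioned in the abstract. It is not the authors' implementation. It assumes each utterance has already been reduced to a fixed-length feature vector, simplifies each mixture to a single diagonal Gaussian per TTS source, and uses a likelihood threshold as one possible way to flag unknown algorithms. The labels `tts_a` and `tts_b`, the toy features, and the `reject_below` parameter are all hypothetical.

```python
import math
import random
from statistics import fmean, pvariance


def fit_diag_gaussian(rows):
    """Fit one diagonal Gaussian: a (mean, variance) pair per feature dimension."""
    columns = list(zip(*rows))
    # A small variance floor avoids division by zero on constant features.
    return [(fmean(col), pvariance(col) + 1e-6) for col in columns]


def log_likelihood(model, x):
    """Log-density of feature vector x under a diagonal Gaussian."""
    return sum(
        -0.5 * (math.log(2 * math.pi * var) + (xi - mu) ** 2 / var)
        for (mu, var), xi in zip(model, x)
    )


def attribute(models, x, reject_below=None):
    """Return the best-scoring source label, or "unknown" if the best
    log-likelihood falls below the (assumed) rejection threshold."""
    scores = {label: log_likelihood(m, x) for label, m in models.items()}
    best = max(scores, key=scores.get)
    if reject_below is not None and scores[best] < reject_below:
        return "unknown"
    return best


# Toy stand-in data: 200 three-dimensional feature vectors per hypothetical source.
rng = random.Random(0)
train = {
    "tts_a": [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(200)],
    "tts_b": [[rng.gauss(5.0, 1.0) for _ in range(3)] for _ in range(200)],
}
models = {label: fit_diag_gaussian(rows) for label, rows in train.items()}

print(attribute(models, [0.0, 0.0, 0.0]))
print(attribute(models, [5.0, 5.0, 5.0]))
print(attribute(models, [50.0, 50.0, 50.0], reject_below=-50.0))
```

In a real system the feature vectors would come from the audio itself, and the deep learning models named in the abstract replace this hand-built scoring with learned representations. The thresholding step is only one of several ways an open-set "unknown" class could be handled.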
Taxonomy
Topics: Speech Recognition and Synthesis
