VANPY: Voice Analysis Framework
Gregory Koushnir, Michael Fire, Galit Fuhrmann Alpert, Dima Kagan

TL;DR
VANPY is an open-source Python framework that automates voice data processing, feature extraction, and classification, enabling comprehensive speaker characterization for various applications including emotion and demographic analysis.
Contribution
The paper introduces VANPY, a modular, extensible framework with new in-house components for detailed voice-based speaker attribute classification.
Findings
Robust performance of VANPY components across datasets
Successful extraction of multiple speaker characteristics from movie voices
Framework demonstrates versatility in voice analysis tasks
Abstract
Voice data is increasingly used in modern digital communications, yet comprehensive tools for automated voice analysis and characterization are still lacking. To this end, we developed the VANPY (Voice Analysis in Python) framework for automated pre-processing, feature extraction, and classification of voice data. VANPY is an open-source, end-to-end framework for speaker characterization from voice data. It is designed with extensibility in mind, allowing easy integration of new components and adaptation to various voice analysis applications. It currently incorporates over fifteen voice analysis components, including music/speech separation, voice activity detection, speaker embedding, vocal feature extraction, and various classification models. Four of VANPY's components were developed in-house and…
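The modular design described in the abstract (pre-processing, then feature extraction, then classification, with room for new components) can be illustrated with a small sketch. This is not VANPY's actual API: the `Component` and `Pipeline` classes and the three toy stages below are hypothetical stand-ins, assumed only to show how independent stages might be chained over a shared record.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical types for illustration only; VANPY's real interfaces may differ.
Record = Dict[str, object]


@dataclass
class Component:
    """A named processing stage that reads a record and returns new fields."""
    name: str
    run: Callable[[Record], Record]


@dataclass
class Pipeline:
    """Runs components in order, merging each stage's output into the record."""
    components: List[Component] = field(default_factory=list)

    def add(self, component: Component) -> "Pipeline":
        self.components.append(component)
        return self

    def process(self, record: Record) -> Record:
        out = dict(record)  # copy so the caller's input is not mutated
        for component in self.components:
            out.update(component.run(out))
        return out


def toy_vad(rec: Record) -> Record:
    # Stand-in for voice activity detection: keep samples above a threshold.
    return {"voiced": [s for s in rec["samples"] if abs(s) > 0.1]}


def toy_features(rec: Record) -> Record:
    # Stand-in for feature extraction: mean energy of the voiced samples.
    voiced = rec["voiced"]
    energy = sum(s * s for s in voiced) / len(voiced) if voiced else 0.0
    return {"energy": energy}


def toy_classifier(rec: Record) -> Record:
    # Stand-in for a classification model: threshold on the energy feature.
    return {"label": "loud" if rec["energy"] > 0.25 else "quiet"}


pipeline = (
    Pipeline()
    .add(Component("vad", toy_vad))
    .add(Component("features", toy_features))
    .add(Component("classifier", toy_classifier))
)
result = pipeline.process({"samples": [0.0, 0.6, -0.8, 0.05]})
print(result["label"])
```

Because every stage only reads from and writes to the shared record, adding a new component in this sketch means writing one more function and one more `add` call, which is one plausible way to achieve the extensibility the abstract describes.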
Code & Models
- 🤗 griko/gender_cls_svm_ecapa_voxceleb · model · 2.3k dl
- 🤗 griko/height_reg_svr_ecapa_voxceleb · model · 2 dl
- 🤗 griko/age_reg_svr_ecapa_voxceleb2 · model · 5 dl
- 🤗 griko/age_reg_svr_ecapa_librosa_voxceleb2 · model · 7 dl
- 🤗 griko/age_reg_ann_ecapa_timit · model · 17 dl
- 🤗 griko/age_reg_ann_ecapa_librosa_combined · model · 65 dl · ♡ 1
- 🤗 griko/emotion_7_cls_svm_ecapa_ravdess · model · 14 dl · ♡ 1
- 🤗 griko/emotion_8_cls_svm_ecapa_ravdess · model · 1 dl · ♡ 1
Taxonomy
Topics: Speech and dialogue systems
