Fractal Dimension Pattern Based Multiresolution Analysis for Rough Estimator of Person-Dependent Audio Emotion Recognition
Miao Cheng, Ah Chung Tsoi

TL;DR
This paper introduces a novel multiresolution analysis method using fractal dimension patterns to improve person-dependent audio emotion recognition, capturing intrinsic auditory emotional features for better classification.
Contribution
It proposes a new approach combining fractal dimension features with multiresolution analysis for more effective person-dependent audio emotion recognition.
Findings
Achieved performance comparable to existing methods
Effectively captures intrinsic emotional features from audio signals
Demonstrated robustness across different speakers
Abstract
As a general means of expression, audio has attracted much attention, and its analysis and recognition have wide real-world applications. Audio emotion recognition (AER) attempts to infer human emotional states from given utterance signals, and has been studied broadly for its role in developing friendly human-machine interfaces. Distinguished from existing works, this work models person-dependent patterns of audio emotions and computes fractal dimension features for acoustic feature extraction. Furthermore, the method is able to efficiently learn intrinsic characteristics of auditory emotions, with utterance features learned from the fractal dimensions of each sub-band. Experimental results show that the proposed method provides comparable performance for audio emotion recognition.
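The abstract does not say which wavelet or which fractal dimension estimator is used, so the following is only a minimal sketch of the general idea of per-sub-band fractal dimension features. It assumes a Haar decomposition and the Higuchi estimator; both are stand-ins chosen for illustration, and the function names `haar_subbands`, `higuchi_fd`, and `fd_features` are hypothetical.

```python
import numpy as np

def haar_subbands(signal, levels):
    """Split a 1-D signal into Haar wavelet sub-bands (assumed wavelet).

    Returns [detail_1, ..., detail_levels, approximation].
    """
    approx = np.asarray(signal, dtype=float)
    bands = []
    for _ in range(levels):
        if approx.size % 2:                 # drop the last sample if odd
            approx = approx[:-1]
        even, odd = approx[0::2], approx[1::2]
        bands.append((even - odd) / np.sqrt(2.0))  # detail coefficients
        approx = (even + odd) / np.sqrt(2.0)       # coarser approximation
    bands.append(approx)
    return bands

def higuchi_fd(x, k_max=8):
    """Higuchi fractal dimension of a 1-D series (assumed estimator)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    if n < 2 * k_max:
        raise ValueError("series too short for the chosen k_max")
    ks = np.arange(1, k_max + 1)
    lengths = np.empty(k_max)
    for i, k in enumerate(ks):
        per_offset = []
        for m in range(k):
            idx = np.arange(m, n, k)        # samples m, m+k, m+2k, ...
            dist = np.abs(np.diff(x[idx])).sum()
            norm = (n - 1) / ((idx.size - 1) * k)
            per_offset.append(dist * norm / k)
        lengths[i] = np.mean(per_offset)
    if np.any(lengths <= 0):
        return float("nan")                 # undefined for a flat series
    # Slope of log L(k) against log(1/k) is the dimension estimate.
    return float(np.polyfit(np.log(1.0 / ks), np.log(lengths), 1)[0])

def fd_features(signal, levels=3, k_max=8):
    """One fractal dimension value per sub-band, as a feature vector."""
    return np.array([higuchi_fd(b, k_max) for b in haar_subbands(signal, levels)])
```

With this estimator a straight ramp yields a value near 1 and white noise a value near 2, so the feature vector summarises how irregular each frequency band of an utterance is. A classifier trained per speaker on such vectors would correspond to the person-dependent setting the abstract describes, though the paper's actual pipeline may differ.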
Taxonomy
Topics: Music and Audio Processing · Speech and Audio Processing · Image and Signal Denoising Methods
