Design and Development of Laughter Recognition System Based on Multimodal Fusion and Deep Learning
Fuzheng Zhao, Yu Bai

TL;DR
This paper presents a multimodal deep learning system for laughter recognition that combines facial and audio features, achieving 80% accuracy and demonstrating potential in affective computing and HCI applications.
Contribution
The paper introduces a multimodal fusion approach to laughter recognition based on deep learning, integrating image and audio processing techniques.
Findings
Achieved 80% accuracy, precision, recall, and F1 score on the test dataset.
Demonstrated robustness to real-world data variability.
Validated the effectiveness of multimodal fusion for laughter recognition.
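To make the reported metrics concrete, the following is a minimal sketch of how accuracy, precision, recall, and F1 are computed for a binary laughter / non-laughter classifier. The function name `binary_metrics` and the label counts are hypothetical and chosen only for illustration; they are not the paper's test data. The example shows one confusion pattern under which all four metrics come out equal.

```python
def binary_metrics(y_true, y_pred):
    """Return accuracy, precision, recall and F1 for binary labels (1 = laughter)."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # laughter, predicted laughter
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-laughter, predicted non-laughter
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarm
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed laughter

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical balanced test set (NOT the paper's data): 50 laughter, 50 non-laughter clips.
y_true = [1] * 50 + [0] * 50
# 40 laughter clips detected, 10 missed; 40 non-laughter clips rejected, 10 false alarms.
y_pred = [1] * 40 + [0] * 10 + [0] * 40 + [1] * 10

metrics = binary_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

With these illustrative counts the false positives and false negatives are equal in number, which is why precision, recall, and F1 coincide with accuracy; identical values across all four metrics are consistent with, but do not prove, a similarly balanced error pattern in the reported results.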
Abstract
This study aims to design and implement a laughter recognition system based on multimodal fusion and deep learning, leveraging image and audio processing technologies to achieve accurate laughter recognition and emotion analysis. First, the system loads video files and uses the OpenCV library to extract facial information while employing the Librosa library to process audio features such as MFCC. Then, multimodal fusion techniques are used to integrate image and audio features, followed by training and prediction using deep learning models. Evaluation results indicate that the model achieved 80% accuracy, precision, and recall on the test dataset, with an F1 score of 80%, demonstrating robust performance and the ability to handle real-world data variability. This study not only verifies the effectiveness of multimodal fusion methods in laughter recognition but also highlights their…
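The abstract does not state which fusion strategy is used. As one possible reading, the sketch below assumes feature-level (early) fusion: each modality's feature vector is standardised separately and the two are concatenated into a single input for a downstream classifier. The helper names `zscore` and `fuse_features` and the numeric values are hypothetical placeholders, not outputs of OpenCV or Librosa and not the authors' implementation.

```python
from statistics import mean, pstdev


def zscore(values):
    """Standardise a feature vector to zero mean and unit variance."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant vector carries no information after centring.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def fuse_features(face_features, audio_features):
    """Early fusion (an assumption): normalise each modality, then concatenate."""
    return zscore(face_features) + zscore(audio_features)


# Hypothetical per-clip features, for illustration only.
face = [0.2, 0.4, 0.6]              # e.g. facial descriptors from the image branch
audio = [-120.0, 35.0, 10.0, -5.0]  # e.g. time-averaged MFCC coefficients

fused = fuse_features(face, audio)
print(len(fused))  # dimensionality of the fused vector
```

Standardising each modality before concatenation keeps features with a large numeric range, such as raw MFCC values, from dominating the fused vector. Other designs, such as decision-level (late) fusion of separate per-modality models, would also fit the abstract's description.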
Taxonomy
Topics: Educational Technology and Pedagogy
Methods: Lib · Focus
