TL;DR
This paper introduces a standardized, open-source benchmarking framework for evaluating deep learning models on audio-visual personality recognition, enabling fair comparison and reproducibility of results across different approaches.
Contribution
It provides the first reproducible benchmark for audio-visual personality recognition, comparing eight models and analyzing the impact of modeling strategies on performance.
Findings
Apparent personality traits, inferred from facial behaviors, are predicted more reliably than self-reported ones.
Visual models generally outperform audio models in personality prediction.
Reproduced models tend to perform worse than originally reported.
Abstract
Personality determines a wide variety of human daily and working behaviours, and is crucial for understanding human internal and external states. In recent years, a large number of automatic personality computing approaches have been developed to predict either the apparent personality or self-reported personality of the subject based on non-verbal audio-visual behaviours. However, the majority of them suffer from complex and dataset-specific pre-processing steps and model training tricks. In the absence of a standardized benchmark with consistent experimental settings, it is not only impossible to fairly compare the real performance of these personality computing models, but the models are also difficult to reproduce. In this paper, we present the first reproducible audio-visual benchmarking framework to provide a fair and consistent evaluation of eight existing personality computing…
