Unsupervised Audio-Visual Subspace Alignment for High-Stakes Deception Detection
Leena Mathur, Maja J Matarić

TL;DR
This paper introduces an unsupervised multimodal transfer learning method that uses subspace alignment to detect high-stakes deception in videos. It outperforms non-aligned models and performs comparably to a number of existing supervised approaches, without requiring labeled high-stakes data.
Contribution
It presents the first unsupervised approach for high-stakes deception detection that adapts low-stakes training data to real-world scenarios using subspace alignment.
Findings
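The best unsupervised subspace-alignment (SA) models outperform models trained without SA.
The best unsupervised SA models outperform human ability at detecting deception.
The best unsupervised SA models perform comparably to a number of existing supervised models.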
Abstract
Automated systems that detect deception in high-stakes situations can enhance societal well-being across medical, social work, and legal domains. Existing models for detecting high-stakes deception in videos have been supervised, but labeled datasets to train models can rarely be collected for most real-world applications. To address this problem, we propose the first multimodal unsupervised transfer learning approach that detects real-world, high-stakes deception in videos without using high-stakes labels. Our subspace-alignment (SA) approach adapts audio-visual representations of deception in lab-controlled low-stakes scenarios to detect deception in real-world, high-stakes situations. Our best unsupervised SA models outperform models without SA, outperform human ability, and perform comparably to a number of existing supervised models. Our research demonstrates the potential for…
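The abstract does not spell out the alignment step, so the following is only a minimal sketch of the classic subspace-alignment recipe for unsupervised domain adaptation, which may differ from the authors' exact method. It assumes that low-stakes (source) and high-stakes (target) clips have already been turned into fixed-length feature vectors. The function names `pca_basis` and `subspace_align`, the subspace dimension `d`, and the use of plain PCA are illustrative assumptions, not details taken from the paper.

```python
import numpy as np


def pca_basis(X, d):
    """Return the top-d principal directions of X as an (n_features, d) matrix."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are orthonormal principal directions, ordered by singular value.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[:d].T


def subspace_align(source, target, d):
    """Align the source PCA subspace to the target PCA subspace.

    source: (n_source, n_features) labeled low-stakes features (assumed input)
    target: (n_target, n_features) unlabeled high-stakes features (assumed input)
    Returns the aligned source features and the projected target features,
    both with d columns. No target labels are used anywhere.
    """
    Ps = pca_basis(source, d)  # source subspace basis
    Pt = pca_basis(target, d)  # target subspace basis
    M = Ps.T @ Pt              # (d, d) alignment matrix
    source_aligned = (source - source.mean(axis=0)) @ Ps @ M
    target_projected = (target - target.mean(axis=0)) @ Pt
    return source_aligned, target_projected
```

In this sketch, a classifier would be trained on `source_aligned` with the low-stakes labels and then applied to `target_projected`, so that high-stakes labels are needed only for evaluation.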
