The Acoustic Camouflage Phenomenon: Re-evaluating Speech Features for Financial Risk Prediction
Dhruvin Dungrani, Disha Dungrani

TL;DR
This study examines the limitations of acoustic speech features in financial risk prediction, revealing that media-trained vocal regulation can impair multimodal models' performance in high-stakes environments.
Contribution
The paper demonstrates that acoustic features can degrade multimodal models when speakers practice media-trained vocal regulation, which marks a boundary condition for speech-based financial forecasting.
Findings
Adding acoustic features via late fusion reduced recall on tail-risk downside events from 66.25% (NLP stream alone) to 47.08%.
Media-trained vocal regulation introduces noise that disrupts multimodal learning.
The paper names this effect Acoustic Camouflage and identifies it as a key challenge in speech-based risk prediction.
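To put the reported recall figures in perspective, the short Python sketch below computes the absolute and relative size of the drop. Only the two percentages come from the paper; the `recall` helper and the variable names are illustrative, and the underlying true-positive and false-negative counts are not given in this summary.

```python
def recall(true_positives: int, false_negatives: int) -> float:
    """Recall = TP / (TP + FN): the share of actual risk events the model flags."""
    return true_positives / (true_positives + false_negatives)

# Reported values (percent). Everything else here is illustrative.
nlp_only_recall = 66.25
fused_recall = 47.08

# Absolute drop in percentage points, and drop relative to the NLP baseline.
absolute_drop = nlp_only_recall - fused_recall
relative_drop = absolute_drop / nlp_only_recall * 100

print(f"Absolute drop: {absolute_drop:.2f} percentage points")  # 19.17
print(f"Relative drop: {relative_drop:.1f}% of baseline")       # 28.9%
```

In other words, the fused model misses roughly 29% of the downside events that the text-only model would have caught.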
Abstract
In computational paralinguistics, detecting cognitive load and deception from speech signals is a heavily researched domain. Recent efforts have attempted to apply these acoustic frameworks to corporate earnings calls to predict catastrophic stock market volatility. In this study, we empirically investigate the limits of acoustic feature extraction (pitch, jitter, and hesitation) when applied to highly trained speakers in in-the-wild teleconference environments. Utilizing a two-stream late-fusion architecture, we contrast an acoustic-based stream with a baseline Natural Language Processing (NLP) stream. The isolated NLP model achieved a recall of 66.25% for tail-risk downside events. Surprisingly, integrating acoustic features via late fusion significantly degraded performance, reducing recall to 47.08%. We identify this degradation as Acoustic Camouflage, where media-trained vocal…
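As a rough illustration of how late fusion can lower recall, the Python sketch below averages the output probabilities of a text stream and an acoustic stream. This is a minimal sketch under stated assumptions: the summary does not specify the paper's fusion rule, so weighted averaging is assumed here, and all probabilities and labels are invented toy values, not data from the study. The function names `late_fusion` and `recall_at` are hypothetical.

```python
from typing import List

def late_fusion(p_text: List[float], p_audio: List[float],
                w_audio: float = 0.5) -> List[float]:
    """Assumed fusion rule: weighted average of the per-stream probabilities."""
    return [(1 - w_audio) * t + w_audio * a for t, a in zip(p_text, p_audio)]

def recall_at(probs: List[float], labels: List[int],
              threshold: float = 0.5) -> float:
    """Recall of the positive class when predicting 1 for probs >= threshold."""
    tp = sum(1 for p, y in zip(probs, labels) if y == 1 and p >= threshold)
    fn = sum(1 for p, y in zip(probs, labels) if y == 1 and p < threshold)
    return tp / (tp + fn)

# Toy data (invented): 1 = downside event followed the call, 0 = no event.
labels = [1, 1, 1, 1, 0, 0]
# Text stream: moderately informative about the label.
p_text = [0.9, 0.7, 0.6, 0.4, 0.3, 0.2]
# Acoustic stream: a regulated, calm voice yields low risk scores
# regardless of the true label.
p_audio = [0.2, 0.2, 0.3, 0.3, 0.3, 0.2]

text_recall = recall_at(p_text, labels)
fused_recall = recall_at(late_fusion(p_text, p_audio), labels)

print(f"Text-only recall: {text_recall:.2f}")
print(f"Fused recall:     {fused_recall:.2f}")
```

In this toy setup the acoustic scores stay low even for true risk events, so averaging pulls several correctly flagged calls below the decision threshold. That is one plausible mechanism consistent with the degradation the abstract describes, not a reconstruction of the authors' model.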
