Multimodal Belief Prediction
John Murzaku, Adil Soubki, Owen Rambow

TL;DR
This paper introduces the first multimodal approach to belief prediction, combining text and audio cues to better understand speaker commitment, and demonstrates improved performance over unimodal methods using the CB-Prosody corpus.
Contribution
It presents a novel multimodal belief prediction framework that integrates text and audio data, outperforming unimodal baselines and establishing new benchmarks.
Findings
The multimodal approach improves belief prediction over text-only and audio-only baselines on the CBP corpus.
Audio and text modalities complement each other effectively.
Fusion methods enhance model performance over single modalities.
Abstract
Recognizing a speaker's level of commitment to a belief is a difficult task; humans do not only interpret the meaning of the words in context, but also understand cues from intonation and other aspects of the audio signal. Many papers and corpora in the NLP community have approached the belief prediction task using text-only approaches. We are the first to frame and present results on the multimodal belief prediction task. We use the CB-Prosody corpus (CBP), containing aligned text and audio with speaker belief annotations. We first report baselines and significant features using acoustic-prosodic features and traditional machine learning methods. We then present text and audio baselines for the CBP corpus fine-tuning on BERT and Whisper respectively. Finally, we present our multimodal architecture which fine-tunes on BERT and Whisper and uses multiple fusion methods, improving on both…
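The abstract mentions "multiple fusion methods" without naming them in the text available here. As a rough illustration only, the sketch below shows two common ways of combining a text representation with an audio representation: early fusion (concatenate embeddings, then apply one prediction head) and late fusion (blend the two unimodal predictions). Everything in it is an assumption made for illustration: the function names, the embedding sizes, the random stand-in embeddings, the scalar belief score, and the fixed blending weight `alpha`. It is not the authors' architecture and does not load BERT or Whisper.

```python
import numpy as np

def early_fusion(text_emb, audio_emb, weights, bias):
    """Concatenate per-utterance embeddings, then apply one linear head."""
    fused = np.concatenate([text_emb, audio_emb], axis=-1)  # (batch, d_text + d_audio)
    return fused @ weights + bias                           # (batch,)

def late_fusion(text_pred, audio_pred, alpha=0.5):
    """Blend two unimodal predictions; alpha is the weight on the text model."""
    return alpha * text_pred + (1.0 - alpha) * audio_pred

# Hypothetical stand-ins for pooled encoder outputs (sizes chosen arbitrarily).
rng = np.random.default_rng(0)
text_emb = rng.normal(size=(4, 8))    # 4 utterances, 8-dim text embedding
audio_emb = rng.normal(size=(4, 6))   # 4 utterances, 6-dim audio embedding
weights = rng.normal(size=(14,))      # one weight per concatenated feature

early_scores = early_fusion(text_emb, audio_emb, weights, bias=0.0)

# Hypothetical unimodal belief scores for two utterances.
late_scores = late_fusion(np.array([2.0, -1.0]), np.array([1.0, 1.0]), alpha=0.5)
```

In early fusion the prediction head can learn interactions between the two modalities, whereas late fusion keeps the unimodal models independent and only combines their outputs. Which of these, if either, corresponds to the paper's fusion methods is not stated in the text above.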
Taxonomy
Topics: Advanced Text Analysis Techniques · Advanced Computational Techniques and Applications · Bayesian Modeling and Causal Inference
Methods: Attention Is All You Need · WordPiece · Linear Warmup With Linear Decay · Adam · Attention Dropout · Weight Decay · Linear Layer · Multi-Head Attention · Dropout
