Beyond Acoustic Emotion Recognition: Multimodal Pathos Analysis in Political Speech Using LLM-Based and Acoustic Emotion Models

Juergen Dietrich

arXiv:2605.22732 · cs.AI · May 22, 2026

TL;DR

This case study compares an acoustic emotion recognition model with LLM-based multimodal analysis on a single political speech and finds that the LLM captures semantically grounded political emotion better than the acoustic model alone.

Contribution

In a single-speech case study (51 segments, 245 s), LLM-based multimodal analysis correlates more strongly with TRUST-Pathos scores than an acoustic SER model does, which points to the importance of semantic context.

Findings

01

Gemini 2.5 Flash Valence correlates strongly with TRUST-Pathos scores (Spearman rho = +0.664, p < 0.001)

02

The acoustic model's Valence (emotion2vec_plus_large) correlates only weakly with TRUST-Pathos scores (rho = +0.097)

03

Standard SER datasets have biases and limitations for political speech analysis
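
The rho values in these findings are Spearman rank correlations, as the abstract states. A minimal sketch of how such a coefficient can be computed is given here: rank both series (tied values share the mean of their positions) and take the Pearson correlation of the ranks. The function names and the sample numbers are illustrative assumptions, not the paper's code or data.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the whole run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the two rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one series is constant")
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Made-up per-segment scores for illustration only (not from the paper).
    llm_valence = [0.2, -0.4, 0.7, 0.1, -0.6]
    pathos_score = [3.0, 2.0, 4.5, 3.5, 1.0]
    print(f"rho = {spearman_rho(llm_valence, pathos_score):+.3f}")
```

Because only the rank order enters the calculation, the coefficient is unaffected by any monotone rescaling of either score, which is convenient when the two scales (for example Valence and Pathos) are not directly comparable.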

Abstract

We investigate whether acoustic emotion recognition models can serve as proxies for the Pathos dimension in political speech analysis, as operationalised by the TRUST multi-agent large language model (LLM) pipeline. Using a Bundestag plenary speech by Felix Banaszak (51 segments, 245 s) as a case study, we compare three analysis modalities: (1) emotion2vec_plus_large, an acoustic speech emotion recognition (SER) model whose continuous Arousal and Valence values are derived via post-hoc Russell Circumplex projection; (2) Gemini 2.5 Flash, an LLM analysing the full speech audio together with its transcript in an open-ended, context-aware fashion; and (3) TRUST-Pathos scores from a three-advocate LLM supervisor ensemble. Spearman rank correlations reveal that Gemini Valence correlates strongly with TRUST-Pathos (rho = +0.664, p < 0.001), whereas emotion2vec Valence does not (rho = +0.097,…
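
The abstract says the acoustic model's continuous Arousal and Valence values are derived by a post-hoc Russell Circumplex projection, but the text shown here does not give the formula. One plausible reading, sketched below purely as an assumption, is a probability-weighted average of fixed (valence, arousal) coordinates assigned to each categorical emotion label. The label set, the coordinate values, and the function name are hypothetical and are not taken from the paper or from the emotion2vec model.

```python
# Hypothetical (valence, arousal) coordinates in [-1, 1] for a few labels.
# These numbers are placeholders chosen for illustration only.
CIRCUMPLEX_COORDS = {
    "angry": (-0.6, 0.8),
    "happy": (0.8, 0.5),
    "sad": (-0.7, -0.5),
    "neutral": (0.0, 0.0),
}


def project_to_circumplex(label_probs):
    """Map categorical emotion probabilities to a (valence, arousal) point.

    Labels without a coordinate are ignored, and the remaining probability
    mass is renormalised so the result is a weighted average.
    """
    known = {k: p for k, p in label_probs.items() if k in CIRCUMPLEX_COORDS}
    total = sum(known.values())
    if total <= 0:
        raise ValueError("no probability mass on labels with known coordinates")
    valence = sum(p * CIRCUMPLEX_COORDS[k][0] for k, p in known.items()) / total
    arousal = sum(p * CIRCUMPLEX_COORDS[k][1] for k, p in known.items()) / total
    return valence, arousal


if __name__ == "__main__":
    # Made-up classifier output for one speech segment.
    segment_probs = {"angry": 0.5, "neutral": 0.3, "happy": 0.2}
    v, a = project_to_circumplex(segment_probs)
    print(f"valence={v:+.2f}, arousal={a:+.2f}")
```

Under this assumed scheme the continuous values depend entirely on the categorical classifier's output and on the chosen coordinates, so they carry no information beyond what the acoustic label probabilities already contain.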
