I'm Fine, But My Voice Isn't: Cross-Modal Affective Dissonance Detection for Reflective Journaling

Sumin Lee

arXiv:2604.27517·cs.HC·May 1, 2026

TL;DR

This paper formalizes Cross-Modal Affective Dissonance Detection (CADD) for reflective journaling and contributes CADD-Journal, a 1,800-sample TTS dataset; DACM, a dual-encoder model with asymmetric cross-modal attention; and an analysis of the domain gap between TTS-trained models and real speech.

Contribution

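The paper formalizes cross-modal affective dissonance detection as a directional three-way classification task, creates the CADD-Journal dataset, proposes DACM, a dual-encoder model with asymmetric cross-modal attention, and evaluates the domain gap between TTS-trained models and naturalistic speech.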

Findings

01 DACM achieves macro-F1 0.711 on affective dissonance detection.

02 Asymmetric cross-modal attention is the dominant driver of performance in a four-step ablation (+0.242).

03 A substantial domain gap exists between TTS-trained models and real speech.
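For readers unfamiliar with the metric in the first finding, macro-F1 is the unweighted mean of the per-class F1 scores, so each of the three CADD classes counts equally regardless of how many samples it has. The sketch below shows the standard definition only; the function name `macro_f1` and the toy labels are illustrative assumptions, not the paper's evaluation code or data.

```python
def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of per-class F1 scores (standard macro-F1)."""
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy example (made-up labels, not results from the paper)
labels = ["masking", "coping", "congruent"]
y_true = ["masking", "coping", "congruent", "congruent"]
y_pred = ["masking", "congruent", "congruent", "congruent"]
score = macro_f1(y_true, y_pred, labels)
```

Because every class contributes equally, a model that ignores a minority class such as Coping is penalized heavily, which is why macro-F1 suits a three-way task like this one.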

Abstract

Digital journaling creates an authenticity gap: users consciously translate raw emotions into text, often sanitizing narratives even in private writing. We formalize this as Cross-Modal Affective Dissonance Detection (CADD), a directional three-way classification distinguishing Masking (positive text, negative acoustics), Coping (negative text, positive acoustics), and Congruent utterances, grounded in Gross's process model of emotion regulation. We present three further contributions: (i) CADD-Journal, a 1,800-sample TTS dataset with a shared-sentence-pool design that provably isolates acoustic signal from textual content; (ii) DACM, a dual-encoder model with asymmetric cross-modal attention that resolves a gradient degeneracy in pooled fusion, achieving macro-F1 0.711, with a four-step ablation demonstrating that asymmetric attention is the dominant driver (+0.242) while the DIM is…
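The directional three-way label scheme described in the abstract can be illustrated with a small rule-based sketch. This is not the paper's model (DACM learns the classification from text and audio); it only spells out the label definitions. The scalar valence scores, the `threshold` parameter, and the names `Dissonance` and `cadd_label` are assumptions introduced for illustration.

```python
from enum import Enum


class Dissonance(Enum):
    MASKING = "masking"      # positive text, negative acoustics
    COPING = "coping"        # negative text, positive acoustics
    CONGRUENT = "congruent"  # text and acoustics agree


def cadd_label(text_valence: float, acoustic_valence: float,
               threshold: float = 0.0) -> Dissonance:
    """Map two hypothetical valence scores to a CADD label.

    Assumption: valence is a scalar where values above `threshold`
    count as positive and all others as negative.
    """
    text_positive = text_valence > threshold
    acoustic_positive = acoustic_valence > threshold
    if text_positive and not acoustic_positive:
        return Dissonance.MASKING
    if acoustic_positive and not text_positive:
        return Dissonance.COPING
    return Dissonance.CONGRUENT


# "I'm fine" written in the journal, but spoken in a flat, strained voice
example = cadd_label(text_valence=0.8, acoustic_valence=-0.6)
```

The labels are directional: Masking and Coping both involve a mismatch between modalities, but they differ in which modality carries the negative signal, so they are kept as separate classes rather than merged into a single "dissonant" label.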
