Hierarchical Delta-Attention Method for Multimodal Fusion
Kunjal Panchal

TL;DR
This paper introduces a hierarchical delta-attention method for multimodal fusion that preserves long-range dependencies and local differences across modalities, achieving competitive emotion classification accuracy with fewer parameters.
Contribution
It proposes a hierarchical delta-attention approach to multimodal fusion that emphasizes local differences within each modality and dependencies across modalities, a new application of attention in this field.
Findings
Achieves near state-of-the-art accuracy in emotion classification.
Uses almost half as many parameters as existing methods.
Effectively captures local and global contextual information across modalities.
Abstract
In vision and linguistics, the main input modalities are facial expressions, speech patterns, and the words uttered. The issue with analyzing any one mode of expression (visual, verbal, or vocal) is that a lot of contextual information can get lost. This leads researchers to inspect multiple modalities to get a thorough understanding of the cross-modal dependencies and temporal context of the situation in order to analyze the expression. This work attempts to preserve the long-range dependencies within and across different modalities, which would be bottlenecked by the use of recurrent networks, and adds the concept of delta-attention to focus on local differences per modality to capture the idiosyncrasies of different people. We explore a cross-attention fusion technique to get the global view of the emotion expressed through these delta-self-attended modalities, in order to fuse all the local…
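The abstract does not spell out how delta-attention or the cross-attention fusion is computed, so the following is only a minimal sketch of one plausible reading, not the paper's implementation. It assumes that "delta" means the first-order temporal difference of a modality's features, that all modalities share one feature dimension, and that no learned projections are used. The function names (`attention`, `delta_self_attention`, `cross_attention_fuse`) and the random inputs are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(query, key, value):
    # Plain scaled dot-product attention; returns (output, weights).
    scores = query @ key.T / np.sqrt(query.shape[-1])
    weights = softmax(scores, axis=-1)
    return weights @ value, weights

def delta_self_attention(x):
    # ASSUMPTION: "delta" is the step-to-step difference of the features.
    # The first row of `delta` is zero because x[0] is prepended to itself.
    delta = np.diff(x, axis=0, prepend=x[:1])
    # Queries and keys come from the deltas, values from the raw features,
    # so the weights depend on local changes rather than absolute values.
    out, _ = attention(delta, delta, x)
    return out

def cross_attention_fuse(modalities):
    # ASSUMPTION: each modality queries the concatenation of the others,
    # and the attended sequence is mean-pooled over time.
    pooled = []
    for i, query in enumerate(modalities):
        others = np.concatenate(
            [m for j, m in enumerate(modalities) if j != i], axis=0
        )
        out, _ = attention(query, others, others)
        pooled.append(out.mean(axis=0))
    return np.concatenate(pooled)

# Hypothetical inputs: (time steps, feature dim) per modality.
rng = np.random.default_rng(0)
visual = rng.normal(size=(5, 8))
vocal = rng.normal(size=(7, 8))
verbal = rng.normal(size=(4, 8))

attended = [delta_self_attention(m) for m in (visual, vocal, verbal)]
fused = cross_attention_fuse(attended)
print(fused.shape)
```

Because attention relates every time step to every other step directly, no recurrent state is carried through the sequence, which is the bottleneck the abstract attributes to recurrent networks. A real model would add learned query, key, and value projections and a classifier on top of the fused vector.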
Taxonomy
Topics: Emotion and Mood Recognition · Speech and Audio Processing · Video Analysis and Summarization
