Multimodal Deep Models for Predicting Affective Responses Evoked by Movies

Ha Thi Phuong Thao; Dorien Herremans; Gemma Roig

arXiv:1909.06957 · cs.CV · September 18, 2019

TL;DR

This study develops multimodal deep learning models that combine video and audio features to predict viewers' emotional responses to movies. It finds that audio features are more predictive than video features, and that optical flow is more informative than RGB frames.

Contribution

Introduces hybrid multimodal models using both visual and audio features, with a comparison of sequential and non-sequential neural network approaches for emotion prediction.

Findings

01. Audio features outperform video features in emotion prediction.

02. Optical flow features are more informative than RGB frames.

03. Predicting emotions independently per time step slightly outperforms LSTM-based sequential models.

Abstract

The goal of this study is to develop and analyze multimodal models for predicting experienced affective responses of viewers watching movie clips. We develop hybrid multimodal prediction models based on both the video and audio of the clips. For the video content, we hypothesize that both image content and motion are crucial features for evoked emotion prediction. To capture such information, we extract features from RGB frames and optical flow using pre-trained neural networks. For the audio model, we compute an enhanced set of low-level descriptors including intensity, loudness, cepstrum, linear predictor coefficients, pitch and voice quality. Both visual and audio features are then concatenated to create audio-visual features, which are used to predict the evoked emotion. To classify the movie clips into the corresponding affective response categories, we propose two approaches based…
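
The fusion step described in the abstract, together with the non-sequential (per-time-step) variant mentioned under Findings, can be sketched roughly as follows. This is a minimal pure-Python illustration and not the authors' implementation: the feature sizes, the number of classes, the random placeholder features, and the untrained linear classifier with softmax are all assumptions made for the example, and the names `fuse` and `predict_per_step` are hypothetical.

```python
import math
import random


def fuse(rgb, flow, audio):
    """Early fusion: concatenate the three feature vectors of each time step."""
    if not (len(rgb) == len(flow) == len(audio)):
        raise ValueError("all streams must have the same number of time steps")
    return [r + f + a for r, f, a in zip(rgb, flow, audio)]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_per_step(fused, weights, bias):
    """Apply the same linear classifier to every time step independently."""
    probs = []
    for x in fused:
        scores = [
            sum(w_i * x_i for w_i, x_i in zip(w, x)) + b
            for w, b in zip(weights, bias)
        ]
        probs.append(softmax(scores))
    return probs


# Hypothetical sizes, chosen only to keep the example small.
T, D_RGB, D_FLOW, D_AUDIO, N_CLASSES = 4, 3, 3, 2, 3
rng = random.Random(0)


def rand_matrix(rows, cols):
    """Random placeholder values standing in for real features or weights."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rgb = rand_matrix(T, D_RGB)      # stand-in for RGB-frame features
flow = rand_matrix(T, D_FLOW)    # stand-in for optical-flow features
audio = rand_matrix(T, D_AUDIO)  # stand-in for audio low-level descriptors

fused = fuse(rgb, flow, audio)   # T vectors of size D_RGB + D_FLOW + D_AUDIO
weights = rand_matrix(N_CLASSES, D_RGB + D_FLOW + D_AUDIO)  # untrained
bias = [0.0] * N_CLASSES

probs = predict_per_step(fused, weights, bias)
labels = [max(range(N_CLASSES), key=p.__getitem__) for p in probs]
print(labels)
```

In a sequential variant, the fused vectors would instead be fed in order to a recurrent model such as an LSTM, so that each prediction can depend on earlier time steps.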

Code & Models

Repositories

ivyha010/emotionprediction
PyTorch · Official

Taxonomy

Methods

Sigmoid Activation · Tanh Activation · Long Short-Term Memory