Emotion Understanding in Videos Through Body, Context, and Visual-Semantic Embedding Loss

Panagiotis Paraskevas Filntisis; Niki Efthymiou; Gerasimos Potamianos; Petros Maragos

arXiv:2010.16396·cs.CV·November 2, 2020

TL;DR

This paper introduces an approach to emotion understanding in videos that combines body language, environmental context, and visual-semantic embeddings, achieving state-of-the-art results on the BoLD benchmark.

Contribution

It extends the Temporal Segment Network (TSN) framework to incorporate scene context and visual representations grounded in word embeddings, applied to emotion recognition in videos.
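As a rough illustration of the TSN idea this contribution builds on, the sketch below splits a video into equal segments, samples one frame index per segment, and averages the per-segment class scores into a video-level prediction. The function names are hypothetical, and the plain averaging of a body stream and a context stream in `fuse_streams` is an assumption made for illustration, not the fusion scheme confirmed by the paper.

```python
import random


def sample_segment_indices(num_frames, num_segments, rng=None):
    """Pick one random frame index from each of num_segments equal chunks."""
    if num_frames < num_segments:
        raise ValueError("need at least one frame per segment")
    rng = rng or random.Random()
    seg_len = num_frames / num_segments
    indices = []
    for k in range(num_segments):
        start = int(k * seg_len)
        # Guard against an empty chunk caused by float rounding.
        end = max(start + 1, int((k + 1) * seg_len))
        indices.append(rng.randrange(start, end))
    return indices


def average_consensus(per_segment_scores):
    """Average class scores over segments (segment consensus)."""
    n = len(per_segment_scores)
    return [sum(col) / n for col in zip(*per_segment_scores)]


def fuse_streams(body_scores, context_scores):
    """Hypothetical late fusion: mean of body and context scores."""
    return [(b + c) / 2 for b, c in zip(body_scores, context_scores)]
```

Sampling one snippet per segment keeps the cost per video fixed regardless of its length, while still covering the whole clip.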

Findings

01

Achieved an Emotion Recognition Score of 0.26235 on the BoLD test set.

02

Surpassed the previous best result of 0.2530.

03

Verified on the validation set of the Body Language Dataset (BoLD).

Abstract

We present our winning submission to the First International Workshop on Bodily Expressed Emotion Understanding (BEEU) challenge. Based on recent literature on the effect of context/environment on emotion, as well as visual representations with semantic meaning using word embeddings, we extend the framework of Temporal Segment Network to accommodate these. Our method is verified on the validation set of the Body Language Dataset (BoLD) and achieves 0.26235 Emotion Recognition Score on the test set, surpassing the previous best result of 0.2530.
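One common way to give visual representations semantic meaning through word embeddings is to project the visual features into the word-embedding space and penalise their distance to the embeddings of the ground-truth labels. The sketch below is a minimal version of that idea under stated assumptions: the target is the mean word vector of the active emotion labels, the distance is mean squared error, and all names (`linear_project`, `target_embedding`, `embedding_loss`) are hypothetical. The abstract does not specify the exact form of the loss used in the submission.

```python
def linear_project(features, weights, bias):
    """Map visual features into the word-embedding space (one linear layer)."""
    return [
        sum(w * x for w, x in zip(row, features)) + b
        for row, b in zip(weights, bias)
    ]


def target_embedding(labels, word_vectors):
    """Mean word vector over active labels (assumed target; None if no label)."""
    active = [vec for y, vec in zip(labels, word_vectors) if y]
    if not active:
        return None
    n = len(active)
    return [sum(col) / n for col in zip(*active)]


def embedding_loss(visual_embedding, target):
    """Mean squared error between projected features and the target."""
    return sum(
        (v - t) ** 2 for v, t in zip(visual_embedding, target)
    ) / len(target)


# Toy usage: three emotion labels with 2-d word vectors (made-up values).
word_vectors = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
labels = [1, 0, 1]  # multi-label annotation for one clip
target = target_embedding(labels, word_vectors)
projected = linear_project([0.5, 0.5], [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
loss = embedding_loss(projected, target)
```

In practice such a term would be added to the usual classification and regression losses with a weighting factor, so that the network is rewarded for placing a clip near the words that describe it.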

Code & Models

Repositories

filby89/NTUA-BEEU-eccv2020
PyTorch, official implementation