Multichannel Attention Network for Analyzing Visual Behavior in Public Speaking

Rahul Sharma; Tanaya Guha; Gaurav Sharma

arXiv:1707.06830 · cs.MM · July 24, 2017

TL;DR

This paper introduces a multichannel attention network that predicts the popularity of TED talks from visual cues alone, showing that non-verbal features are informative about public speaking success and that the model's attention over them is interpretable.

Contribution

The paper presents a novel attention-based LSTM model that predicts talk popularity from visual features and shows how the importance of each visual cue varies over time.
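To make the idea of attention over time and across visual channels concrete, here is a minimal sketch in plain Python. It is not the authors' architecture: the per-frame state vectors stand in for hypothetical LSTM outputs, the channel names (`expression`, `pose`) are illustrative, and every weight is a hand-picked placeholder rather than a learned parameter. The sketch only shows how softmax attention weights pool a sequence into one context vector per channel, and how those contexts could feed a single popularity score.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(states, w):
    """Temporal attention over per-frame states.

    states: list of T vectors (assumed here to be LSTM outputs), each of length d
    w:      scoring vector of length d (would be learned in a real model)
    Returns (weights, context): T attention weights and a length-d weighted average.
    """
    scores = [sum(wi * hi for wi, hi in zip(w, h)) for h in states]
    weights = softmax(scores)
    d = len(states[0])
    context = [sum(a * h[j] for a, h in zip(weights, states)) for j in range(d)]
    return weights, context


def predict_popularity(channels, scorers, out_w, out_b):
    """Fuse one attended context per channel into a probability.

    channels: dict name -> list of per-frame state vectors
    scorers:  dict name -> attention scoring vector for that channel
    out_w:    output weights over the concatenated contexts
    out_b:    output bias
    """
    attn, fused = {}, []
    for name in sorted(channels):
        weights, context = attention_pool(channels[name], scorers[name])
        attn[name] = weights
        fused.extend(context)
    logit = sum(w * x for w, x in zip(out_w, fused)) + out_b
    return 1.0 / (1.0 + math.exp(-logit)), attn


# Toy input: two channels, three time steps, two-dimensional states.
channels = {
    "expression": [[0.1, 0.0], [2.0, 1.0], [0.2, 0.1]],
    "pose": [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]],
}
scorers = {"expression": [1.0, 1.0], "pose": [1.0, -1.0]}
prob, attn = predict_popularity(channels, scorers, [0.5, 0.5, 0.5, 0.5], 0.0)
```

Inspecting `attn` per channel shows which time steps dominate each context vector. In this toy input the `expression` weights concentrate on the middle frame, while the `pose` weights are uniform because every frame scores the same.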

Findings

1. Visual cues alone predict popularity with high accuracy.

2. The model learns a human-like attention mechanism, which makes its predictions easier to interpret.

3. Visual features significantly contribute to public speaking success.

Abstract

Public speaking is an important aspect of human communication and interaction. The majority of computational work on public speaking concentrates on analyzing the spoken content and the verbal behavior of the speakers. While the success of public speaking largely depends on the content of the talk and the verbal behavior, non-verbal (visual) cues, such as gestures and physical appearance, also play a significant role. This paper investigates the importance of visual cues by estimating their contribution towards predicting the popularity of a public lecture. For this purpose, we constructed a large database of more than $1800$ TED talk videos. As a measure of popularity of the TED talks, we leverage the corresponding (online) viewers' ratings from YouTube. Visual cues related to facial and physical appearance, facial expressions, and pose variations are extracted from the video frames…
