Multimodal Storytelling via Generative Adversarial Imitation Learning

Zhiqian Chen, Xuchao Zhang, Arnold P. Boedihardjo, Jing Dai, and Chang-Tien Lu

arXiv:1712.01455 · cs.AI · December 6, 2017 · 1 citation

TL;DR

This paper introduces MIL-GAN, a multimodal imitation learning approach using GANs to model user interests in storytelling, effectively capturing cross-modality information and outperforming existing methods in aligning with user preferences.

Contribution

It presents a novel multimodal imitation learning framework with GANs that directly models user interests from diverse data sources for storytelling.

Findings

01. Outperforms competing methods in user preference alignment

02. Successfully models cross-modality information in storytelling

03. Demonstrates effectiveness through a user study

Abstract

Deriving event storylines is an effective summarization method for succinctly organizing extensive information, and it can significantly alleviate information overload. The critical challenge is the lack of a widely recognized definition of a storyline metric. Prior studies have developed various approaches based on different assumptions about users' interests. These works can extract interesting patterns, but their assumptions do not guarantee that the derived patterns will match users' preferences. Moreover, their exclusive reliance on a single-modality source misses cross-modality information. This paper proposes a method, multimodal imitation learning via generative adversarial networks (MIL-GAN), to directly model users' interests as reflected by various data. In particular, the proposed model addresses the critical challenge by imitating users' demonstrated storylines. Our…

Taxonomy

Topics: Video Analysis and Summarization · Topic Modeling · Music and Audio Processing