Controllable Video Captioning with an Exemplar Sentence

Yitian Yuan; Lin Ma; Jingwen Wang; Wenwu Zhu

arXiv:2112.01073 · cs.CV · December 3, 2021

TL;DR

This paper introduces a method for controllable video captioning: given a video and an exemplar sentence, it generates a caption that describes the video's content while following the exemplar's syntactic form, which also increases caption diversity.

Contribution

We propose the Syntax Modulated Caption Generator (SMCG), which conditions caption syntax on an exemplar sentence within an encoder-decoder-reconstructor architecture.

Findings

01 Effective syntax control in video captioning demonstrated

02 Generated captions preserve semantic content and follow exemplar syntax

03 Method increases diversity of generated video captions

Abstract

In this paper, we investigate a novel and challenging task, namely controllable video captioning with an exemplar sentence. Formally, given a video and a syntactically valid exemplar sentence, the task aims to generate one caption that not only describes the semantic contents of the video, but also follows the syntactic form of the given exemplar sentence. To tackle this exemplar-based video captioning task, we propose a novel Syntax Modulated Caption Generator (SMCG) incorporated in an encoder-decoder-reconstructor architecture. The proposed SMCG takes the video semantic representation as input and conditionally modulates the gates and cells of a long short-term memory (LSTM) network with respect to the encoded syntactic information of the given exemplar sentence. SMCG is therefore able to control the states for word prediction and achieve syntax-customized caption generation.…
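
The abstract does not spell out how the gates and cells are modulated, so the following is only a minimal sketch of one plausible reading, not the authors' formulation. It assumes a scale-and-shift conditioning of each LSTM pre-activation on a syntax vector. The class name `SyntaxModulatedLSTMCell`, its parameters, and the toy dimensions are all hypothetical, and pure-Python lists stand in for tensors.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def linear(weights, bias, vec):
    """Dense layer; `weights` holds one row per output unit."""
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weights, bias)]


def init_linear(rng, n_out, n_in):
    """Small random weights and zero biases."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


class SyntaxModulatedLSTMCell:
    """Hypothetical LSTM cell whose gate and cell pre-activations are
    scaled and shifted by an encoded syntax vector (an assumption)."""

    GATES = ("input", "forget", "output", "cell")

    def __init__(self, input_size, hidden_size, syntax_size, seed=0):
        rng = random.Random(seed)
        self.gate_params = {g: init_linear(rng, hidden_size, input_size + hidden_size)
                            for g in self.GATES}
        self.scale_params = {g: init_linear(rng, hidden_size, syntax_size)
                             for g in self.GATES}
        self.shift_params = {g: init_linear(rng, hidden_size, syntax_size)
                             for g in self.GATES}

    def _modulated(self, gate, xh, syntax):
        pre = linear(*self.gate_params[gate], xh)
        scale = linear(*self.scale_params[gate], syntax)
        shift = linear(*self.shift_params[gate], syntax)
        # (1 + scale) reduces to a plain LSTM when scale and shift are zero.
        return [(1.0 + s) * p + b for p, s, b in zip(pre, scale, shift)]

    def step(self, x, h, c, syntax):
        """One decoding step: x is the video feature, syntax the exemplar code."""
        xh = list(x) + list(h)
        i = [sigmoid(v) for v in self._modulated("input", xh, syntax)]
        f = [sigmoid(v) for v in self._modulated("forget", xh, syntax)]
        o = [sigmoid(v) for v in self._modulated("output", xh, syntax)]
        g = [math.tanh(v) for v in self._modulated("cell", xh, syntax)]
        c_new = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c, i, g)]
        h_new = [ov * math.tanh(cv) for ov, cv in zip(o, c_new)]
        return h_new, c_new


# Same video feature, two different (made-up) syntax encodings.
cell = SyntaxModulatedLSTMCell(input_size=4, hidden_size=3, syntax_size=2)
video_feature = [0.5, -0.2, 0.1, 0.9]
h0, c0 = [0.0] * 3, [0.0] * 3
h_a, c_a = cell.step(video_feature, h0, c0, syntax=[1.0, 0.0])
h_b, c_b = cell.step(video_feature, h0, c0, syntax=[0.0, 1.0])
```

With the video feature held fixed, changing only the syntax vector changes the hidden state that would feed word prediction, which is the kind of control the abstract describes. The released repository should be consulted for the actual modulation used in SMCG.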

Code & Models

Repositories

yytzsy/smcg (PyTorch, official)

Taxonomy

Methods: Memory Network