Poet: Product-oriented Video Captioner for E-commerce

Shengyu Zhang; Ziqi Tan; Jin Yu; Zhou Zhao; Kun Kuang; Jie Liu; Jingren Zhou; Hongxia Yang; Fei Wu

arXiv:2008.06880 · cs.CV · August 18, 2020

TL;DR

Poet is a novel product-oriented video captioning framework that uses graph representations and knowledge-enhanced inference to generate detailed, product-focused descriptions for e-commerce videos, improving caption quality and relevance.

Contribution

The paper introduces Poet, a new framework that models videos as product-oriented graphs and employs knowledge-enhanced inference, advancing product-specific video captioning techniques.

Findings

01

Poet outperforms previous methods in caption quality and product aspect capturing.

02

The framework improves lexical diversity in generated captions.

03

Experiments on BFVD and FFVD datasets validate the effectiveness of Poet.

Abstract

In e-commerce, a growing number of user-generated videos are used for product promotion. Generating video descriptions that narrate the user-preferred product characteristics depicted in the video is vital for successful promotion. Traditional video captioning methods, which focus on routinely describing what exists and happens in a video, are not amenable to product-oriented video captioning. To address this problem, we propose a product-oriented video captioner framework, abbreviated as Poet. Poet first represents the videos as product-oriented spatial-temporal graphs. Then, based on the aspects of the video-associated product, we perform knowledge-enhanced spatial-temporal inference on those graphs to capture the dynamic change of fine-grained product-part characteristics. The knowledge-leveraging module in Poet differs from the traditional design by performing knowledge…
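To make the idea of a spatial-temporal graph over a video more concrete, here is a minimal illustrative sketch. It is not the authors' implementation: the node layout (one node per frame and region slot), the edge rules (full connection within a frame, same-slot links between adjacent frames), and the mean-aggregation step are all assumptions chosen for simplicity, and the names `build_st_graph` and `propagate` are hypothetical.

```python
from itertools import combinations


def build_st_graph(num_frames, regions_per_frame):
    """Return an adjacency dict over (frame, region) nodes.

    Assumed edge rules (illustrative only, not taken from the paper):
    - spatial edges fully connect the regions inside one frame;
    - temporal edges link the same region slot in adjacent frames.
    """
    adj = {
        (t, r): set()
        for t in range(num_frames)
        for r in range(regions_per_frame)
    }
    for t in range(num_frames):
        # Spatial edges within frame t.
        for a, b in combinations(range(regions_per_frame), 2):
            adj[(t, a)].add((t, b))
            adj[(t, b)].add((t, a))
        # Temporal edges from frame t to frame t + 1.
        if t + 1 < num_frames:
            for r in range(regions_per_frame):
                adj[(t, r)].add((t + 1, r))
                adj[(t + 1, r)].add((t, r))
    return adj


def propagate(adj, feats):
    """One round of mean aggregation over each node and its neighbours."""
    out = {}
    for node, nbrs in adj.items():
        group = [feats[node]] + [feats[n] for n in nbrs]
        dim = len(feats[node])
        out[node] = [sum(v[i] for v in group) / len(group) for i in range(dim)]
    return out


# Toy example: 3 frames, 2 region slots per frame, 2-d placeholder features.
adj = build_st_graph(num_frames=3, regions_per_frame=2)
feats = {node: [float(node[0]), float(node[1])] for node in adj}
smoothed = propagate(adj, feats)
```

In a real system the placeholder features would come from a visual backbone, and the aggregation would be a learned, knowledge-conditioned operation rather than a plain mean; the sketch only shows how spatial and temporal edges let information flow between product parts across frames.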

Code & Models

Repositories

shengyuzhang/Poet
Official