Towards Micro-video Thumbnail Selection via a Multi-label Visual-semantic Embedding Model

Liu Bo

arXiv:2202.02930·cs.IR·February 8, 2022

TL;DR

This paper introduces a multi-label visual-semantic embedding model for selecting micro-video thumbnails that align with user interests. It uses a shared semantic space and attention mechanisms to improve thumbnail relevance and attractiveness.

Contribution

The paper proposes a novel multi-label embedding approach with attention mechanisms to better match video frames with user interests for thumbnail selection.

Findings

01

Model significantly outperforms state-of-the-art baselines

02

Effective in capturing user interests through semantic embedding

03

Improves thumbnail relevance and attractiveness

Abstract

The thumbnail, as the first sight of a micro-video, plays a pivotal role in attracting users to click and watch. In real scenarios, the more a thumbnail satisfies users, the more likely the micro-video is to be clicked. In this paper, we aim to select the thumbnail of a given micro-video that meets most users' interests. To this end, we present a multi-label visual-semantic embedding model to estimate the similarity between each frame and the popular topics that users are interested in. In this model, the visual and textual information is embedded into a shared semantic space, whereby the similarity can be measured directly, even for unseen words. Moreover, to compare the frame to all words from the popular topics, we devise an attention embedding space associated with the semantic-attention projection. With the help of these two embedding spaces, the…
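
The frame-to-topic scoring idea in the abstract can be illustrated with a minimal sketch. It is not the paper's model: the projection matrix `W`, the function names, and the softmax-over-similarities attention are all assumptions made for illustration, standing in for the learned shared semantic space and the semantic-attention projection, whose details the abstract does not give.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length so dot products become cosine similarities."""
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norm, eps)


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def select_thumbnail(frame_feats, word_embs, W):
    """Pick the frame that best matches a set of popular-topic words.

    frame_feats: (n_frames, d_visual) visual features, one row per frame.
    word_embs:   (n_words, d_sem) word embeddings of the popular topics.
    W:           (d_visual, d_sem) hypothetical projection into the shared
                 semantic space (assumed to be learned elsewhere).
    """
    frames = l2_normalize(frame_feats @ W)  # frames in the semantic space
    words = l2_normalize(word_embs)         # words in the same space
    sims = frames @ words.T                 # (n_frames, n_words) cosine similarities

    # Assumed attention: each frame weights the words by its own similarity
    # to them, so the best-matching words dominate that frame's score.
    attn = softmax(sims, axis=1)
    scores = (attn * sims).sum(axis=1)      # one relevance score per frame
    return int(np.argmax(scores)), scores
```

Because the words only need an embedding in the shared space, a topic word that never appeared with any training frame can still be scored, which is the property the abstract refers to for unseen words.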

Taxonomy

Topics: Video Analysis and Summarization · Advanced Computing and Algorithms · Misinformation and Its Impacts