Scene-driven Retrieval in Edited Videos using Aesthetic and Semantic Deep Features

Lorenzo Baraldi; Costantino Grana; Rita Cucchiara

arXiv:1604.02546·cs.CV·April 12, 2016

TL;DR

This paper introduces a scene-driven video retrieval method that uses deep semantic and aesthetic features to identify and visualize the most significant parts of edited videos in response to textual queries.

Contribution

It proposes a novel retrieval pipeline that segments videos into scenes, retrieves relevant scenes with deep learning, and visualizes them with meaningful thumbnails.

Findings

1. Effective retrieval of significant scenes demonstrated
2. Thumbnails are both semantically meaningful and aesthetically remarkable
3. Quantitative results show improved retrieval accuracy

Abstract

This paper presents a novel retrieval pipeline for video collections, which aims to retrieve the most significant parts of an edited video for a given query, and represent them with thumbnails which are at the same time semantically meaningful and aesthetically remarkable. Videos are first segmented into coherent and story-telling scenes, then a retrieval algorithm based on deep learning is proposed to retrieve the most significant scenes for a textual query. A ranking strategy based on deep features is finally used to tackle the problem of visualizing the best thumbnail. Qualitative and quantitative experiments are conducted on a collection of edited videos to demonstrate the effectiveness of our approach.
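The three stages named in the abstract (scene segmentation, query-driven scene retrieval, thumbnail ranking) can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not specify the models, features, or scoring functions, so everything below is an assumption made for illustration. In particular, the `Frame` fields, the cosine-similarity scoring, the "best-matching frame" scene score, and the linear blend weight `alpha` are all hypothetical, and scenes are assumed to be already segmented and embedded in the same vector space as the query.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Frame:
    frame_id: str
    semantic: List[float]  # hypothetical semantic embedding of the frame
    aesthetic: float       # hypothetical aesthetic score in [0, 1]


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def scene_score(scene: List[Frame], query_vec: List[float]) -> float:
    # Assumption: a scene is as relevant as its best-matching frame.
    return max(cosine(f.semantic, query_vec) for f in scene)


def best_thumbnail(scene: List[Frame], query_vec: List[float], alpha: float = 0.5) -> Frame:
    # Assumption: linear blend of semantic match and aesthetic score.
    return max(
        scene,
        key=lambda f: alpha * cosine(f.semantic, query_vec) + (1 - alpha) * f.aesthetic,
    )


def retrieve(
    scenes: Dict[str, List[Frame]],
    query_vec: List[float],
    top_k: int = 1,
    alpha: float = 0.5,
) -> List[Tuple[str, str]]:
    """Rank scenes for a query and return (scene id, thumbnail frame id) pairs."""
    ranked = sorted(
        scenes.items(),
        key=lambda kv: scene_score(kv[1], query_vec),
        reverse=True,
    )
    return [
        (scene_id, best_thumbnail(frames, query_vec, alpha).frame_id)
        for scene_id, frames in ranked[:top_k]
    ]


# Toy data: two pre-segmented scenes with made-up features.
scenes = {
    "s1": [Frame("s1_f0", [1.0, 0.0], 0.2), Frame("s1_f1", [0.9, 0.1], 0.9)],
    "s2": [Frame("s2_f0", [0.0, 1.0], 0.8)],
}
query = [1.0, 0.0]  # stands in for an embedded textual query
result = retrieve(scenes, query)
```

With the toy data, the scene closest to the query is ranked first, and the thumbnail chosen within it depends on `alpha`: a value of 1.0 selects purely on semantic match, while lower values let a more aesthetically pleasing frame win over a slightly better semantic match.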
