Ad-Net: Audio-Visual Convolutional Neural Network for Advertisement Detection In Videos

Shervin Minaee; Imed Bouazizi; Prakash Kolan; Hossein Najafzadeh

arXiv:1806.08612·cs.CV·June 25, 2018·6 cites

Open Access

TL;DR

This paper introduces Ad-Net, a two-stream audio-visual CNN that detects commercials in videos by analyzing both audio and visual content, enabling personalized advertisement replacement.

Contribution

The paper presents a novel two-stream CNN architecture that effectively combines audio and visual information for commercial detection in videos, outperforming models with hand-crafted features.
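To make the two-stream idea concrete, here is a minimal sketch of the fusion step only. It assumes the visual and audio branches have already produced fixed-length embeddings, and it assumes fusion by concatenation followed by a small classifier head; the summary above does not say how the paper fuses the two streams. The embedding sizes, the random weights, and all function names are hypothetical placeholders, not the authors' implementation.

```python
import math
import random


def linear(weights, bias, x):
    """Apply one dense layer; weights is a list of rows, x is a vector."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def random_layer(rng, n_out, n_in):
    """Untrained placeholder weights; a real model would learn these."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def fuse_and_classify(visual_emb, audio_emb, hidden, out):
    """Fuse by concatenation (an assumption), then score the shot."""
    fused = visual_emb + audio_emb  # list concatenation, not element-wise sum
    h = relu(linear(hidden[0], hidden[1], fused))
    logit = linear(out[0], out[1], h)[0]
    return sigmoid(logit)  # probability that the shot is a commercial


rng = random.Random(0)
VISUAL_DIM, AUDIO_DIM, HIDDEN_DIM = 8, 4, 6  # hypothetical sizes

hidden = random_layer(rng, HIDDEN_DIM, VISUAL_DIM + AUDIO_DIM)
out = random_layer(rng, 1, HIDDEN_DIM)

# Toy embeddings standing in for the outputs of the two CNN branches.
visual_emb = [rng.random() for _ in range(VISUAL_DIM)]
audio_emb = [rng.random() for _ in range(AUDIO_DIM)]

p_commercial = fuse_and_classify(visual_emb, audio_emb, hidden, out)
print(f"P(commercial) = {p_commercial:.3f}")
```

Because the classifier head sees both embeddings at once, it can weigh cues that neither stream carries alone, which is the intuition behind combining the two modalities.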

Findings

01. The model achieved significantly higher accuracy than previous methods.

02. Using both audio and visual data improves detection performance.

03. The dataset included over 50,000 video and commercial shots.

Abstract

Personalized advertising is a crucial task for many online businesses and video broadcasters. Many of today's broadcasters show the same commercial to all customers, but different viewers have different interests, so it seems reasonable to customize commercials for different groups of people, chosen based on their demographic features and history. In this project, we propose a framework that takes broadcast videos, analyzes them, detects each commercial, and replaces it with a more suitable one. We propose a two-stream audio-visual convolutional neural network in which one branch analyzes the visual information and the other analyzes the audio information; the audio and visual embeddings are then fused and used for commercial detection and content categorization. We show that using both the visual and audio content of the…
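The detect-and-replace framework described in the abstract can be sketched as follows, assuming the video has already been split into shots and each shot carries a boolean commercial label from the detector. The shot names, the `pick_ad` callback, and the rule of merging consecutive commercial shots into one segment are illustrative assumptions, not details taken from the paper.

```python
def commercial_segments(shot_labels):
    """Merge consecutive commercial shots into (start, end) ranges, end exclusive."""
    segments, start = [], None
    for i, is_ad in enumerate(shot_labels):
        if is_ad and start is None:
            start = i  # a commercial break begins here
        elif not is_ad and start is not None:
            segments.append((start, i))  # the break ended before shot i
            start = None
    if start is not None:
        segments.append((start, len(shot_labels)))  # break runs to the end
    return segments


def replace_commercials(shots, shot_labels, pick_ad):
    """Keep program shots and swap each commercial break for one chosen ad."""
    output, cursor = [], 0
    for start, end in commercial_segments(shot_labels):
        output.extend(shots[cursor:start])  # program content before the break
        output.append(pick_ad(end - start))  # hypothetical ad-selection hook
        cursor = end
    output.extend(shots[cursor:])  # program content after the last break
    return output


# Toy input: five shots with hypothetical detector labels.
shots = ["news_1", "ad_a", "ad_b", "news_2", "ad_c"]
labels = [False, True, True, False, True]

result = replace_commercials(shots, labels, lambda n: f"custom_ad_x{n}")
print(result)
```

In a real system `pick_ad` would choose a commercial from the viewer's demographic features and history, and would also have to match the duration of the break it fills; here it only receives the number of shots being replaced.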

Taxonomy

Topics: Video Analysis and Summarization · Music and Audio Processing · Generative Adversarial Networks and Image Synthesis