Audio-based Near-Duplicate Video Retrieval with Audio Similarity Learning

Pavlos Avgoustinakis; Giorgos Kordopatis-Zilos; Symeon Papadopoulos; Andreas L. Symeonidis; Ioannis Kompatsiaris

arXiv:2010.08737·cs.MM·January 12, 2021

TL;DR

This paper introduces AuSiL, an audio similarity learning method for near-duplicate video retrieval. It builds on CNN-based audio descriptors and analyzes temporal patterns of audio similarity between videos; it is robust to speed transformations and competitive with state-of-the-art methods.

Contribution

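The paper presents a new audio similarity learning approach: audio descriptors are extracted with a CNN trained on a large-scale dataset of audio events, and a second CNN captures temporal patterns in the similarity matrix between two videos. The approach is robust to speed transformations.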

Findings

01 Achieves competitive results against state-of-the-art methods.

02 Robust to speed transformations in audio duplicates.

03 Effectively captures temporal audio patterns.

Abstract

In this work, we address the problem of audio-based near-duplicate video retrieval. We propose the Audio Similarity Learning (AuSiL) approach that effectively captures temporal patterns of audio similarity between video pairs. For the robust similarity calculation between two videos, we first extract representative audio-based video descriptors by leveraging transfer learning based on a Convolutional Neural Network (CNN) trained on a large-scale dataset of audio events, and then we calculate the similarity matrix derived from the pairwise similarity of these descriptors. The similarity matrix is subsequently fed to a CNN that captures the temporal structures existing within its content. We train our network by following a triplet generation process and optimizing the triplet loss function. To evaluate the effectiveness of the proposed approach, we have manually annotated two…
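
The pipeline outlined in the abstract (per-segment descriptors, a pairwise similarity matrix, a video-level score, and a triplet loss) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: cosine similarity is assumed as the pairwise measure, `placeholder_video_similarity` is a hypothetical stand-in for the learned CNN, and the toy descriptors and the margin value are invented for the example.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_matrix(desc_a, desc_b):
    """Pairwise similarity between the segment descriptors of two videos.

    Entry [i][j] compares segment i of video A with segment j of video B.
    Cosine similarity is an assumption made for this sketch.
    """
    return [[cosine_similarity(a, b) for b in desc_b] for a in desc_a]


def placeholder_video_similarity(matrix):
    """Hypothetical stand-in for the learned CNN over the similarity matrix.

    Takes the best match in each row and averages over rows.
    """
    return sum(max(row) for row in matrix) / len(matrix)


def triplet_loss(sim_pos, sim_neg, margin):
    """Hinge-style triplet loss on similarity scores.

    Zero once the positive pair scores at least `margin` above the negative pair.
    """
    return max(0.0, sim_neg - sim_pos + margin)


# Toy 2-D descriptors, two segments per video (invented for illustration).
anchor = [[1.0, 0.0], [0.0, 1.0]]
positive = [[2.0, 0.0], [0.0, 3.0]]   # same directions as the anchor
negative = [[1.0, 1.0], [1.0, 1.0]]   # partially aligned with every anchor segment

sim_pos = placeholder_video_similarity(similarity_matrix(anchor, positive))
sim_neg = placeholder_video_similarity(similarity_matrix(anchor, negative))
loss = triplet_loss(sim_pos, sim_neg, margin=0.5)  # margin value is hypothetical
print(sim_pos, sim_neg, loss)
```

With these toy inputs the anchor matches the positive exactly, so the loss depends only on how far the negative's score falls inside the margin. In a trained system the placeholder scoring function would be replaced by the network that reads the similarity matrix.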

Code & Models

Repositories

mever-team/ausil (PyTorch, official)

Taxonomy

Methods: Triplet Loss