City-Identification of Flickr Videos Using Semantic Acoustic Features

Benjamin Elizalde; Guan-Lin Chao; Ming Zeng; Ian Lane

arXiv:1607.03257·cs.MM·July 13, 2016

TL;DR

This paper introduces an audio-only method for city-identification of videos based on semantic acoustic features, showing that the urban sounds in a video's soundtrack carry information about the city it belongs to.

Contribution

The paper presents a method to compute semantic acoustic features for city-identification and reports improved performance using audio alone, without images, user-tags or geo-tags.

Findings

01. Improved state-of-the-art accuracy in city-identification

02. Semantic acoustic features correlate strongly with city location

03. Urban sound taxonomy enhances identification performance

Abstract

City-identification of videos aims to determine the likelihood of a video belonging to a set of cities. In this paper, we present an approach using only audio; we do not use any additional modality such as images, user-tags or geo-tags. In this manner, we show to what extent the city-location of videos correlates with their acoustic information. Success in this task suggests improvements can be made to complement the other modalities. In particular, we present a method to compute and use semantic acoustic features to perform city-identification, and the features show semantic evidence of the identification. The semantic evidence is given by a taxonomy of urban sounds and expresses the potential presence of these sounds in the city soundtracks. We used the MediaEval Placing Task set, which contains Flickr videos labeled by city. In addition, we used the UrbanSound8K set containing…
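The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline, which the truncated abstract does not describe in detail. It assumes that some sound classifier (not shown) has already produced, for each audio segment of a video, a score per urban sound class. The class names are the ten UrbanSound8K categories; the helper names `semantic_features`, `fit_city_centroids` and `city_likelihoods`, the mean pooling, and the nearest-centroid scoring are all hypothetical choices made for illustration.

```python
import numpy as np

# The ten sound classes of UrbanSound8K, used here as the semantic axes.
URBAN_SOUNDS = [
    "air_conditioner", "car_horn", "children_playing", "dog_bark",
    "drilling", "engine_idling", "gun_shot", "jackhammer",
    "siren", "street_music",
]


def semantic_features(segment_scores):
    """Pool per-segment class scores into one vector per video.

    segment_scores: array of shape (n_segments, n_classes), assumed to
    come from a separately trained urban-sound classifier.
    Mean pooling is an assumption, not the paper's stated choice.
    """
    scores = np.asarray(segment_scores, dtype=float)
    return scores.mean(axis=0)


def fit_city_centroids(features, city_labels):
    """Compute one mean semantic vector (centroid) per city."""
    features = np.asarray(features, dtype=float)
    labels = np.asarray(city_labels)
    cities = sorted(set(city_labels))
    centroids = np.stack([features[labels == c].mean(axis=0) for c in cities])
    return cities, centroids


def city_likelihoods(feature, centroids):
    """Turn distances to city centroids into a distribution over cities."""
    distances = np.linalg.norm(centroids - np.asarray(feature, dtype=float), axis=1)
    logits = -distances
    logits = logits - logits.max()  # numerical stability for the softmax
    weights = np.exp(logits)
    return weights / weights.sum()
```

Because each dimension of the feature vector corresponds to a named sound in `URBAN_SOUNDS`, the largest entries of a video's vector can be read as the semantic evidence for a decision, for example that sirens and car horns dominate its soundtrack.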
