Audio Content based Geotagging in Multimedia

Anurag Kumar; Benjamin Elizalde; Bhiksha Raj

arXiv:1606.02816·cs.SD·November 14, 2016


TL;DR

This paper introduces a novel approach to geotag multimedia recordings by analyzing their audio content, leveraging sound class composition and matrix factorization to infer location information.

Contribution

It presents a new method for geotagging multimedia recordings that combines audio-based semantic analysis with matrix factorization.

Findings

01 Effective identification of location from audio content.

02 Utilization of sound class composition for geotagging.

03 Application of matrix factorization techniques to audio data.

Abstract

In this paper we propose methods to extract geographically relevant information from a multimedia recording using its audio. Our method is based primarily on the fact that an urban acoustic environment consists of a variety of sounds. Hence, location information can be inferred from the composition of sound events/classes present in the audio. More specifically, we adopt matrix factorization techniques to obtain the semantic content of a recording in terms of different sound classes. This semantic information is then combined to identify the location of the recording.
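
The idea in the abstract can be illustrated with a minimal sketch. The abstract says only "matrix factorization techniques", so the choice of non-negative matrix factorization (NMF) with fixed per-class bases and multiplicative updates is an assumption here, not a description of the authors' implementation. The function names (`nmf_activations`, `sound_class_composition`), the class labels, and the toy spectra are all hypothetical.

```python
import numpy as np

def nmf_activations(V, W, n_iter=200, eps=1e-9, seed=0):
    """Estimate H >= 0 such that V ~= W @ H, keeping the basis W fixed.

    Uses multiplicative updates for the Euclidean cost (an assumed choice).
    V: (n_freq, n_frames) non-negative spectrogram.
    W: (n_freq, n_bases) non-negative basis matrix.
    """
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[1], V.shape[1])) + eps
    WtV = W.T @ V
    WtW = W.T @ W
    for _ in range(n_iter):
        H *= WtV / (WtW @ H + eps)
    return H

def sound_class_composition(V, class_bases):
    """Return the share of total activation attributed to each sound class.

    class_bases: dict mapping a class name to its (n_freq, k) basis matrix.
    """
    names = list(class_bases)
    W = np.hstack([class_bases[name] for name in names])
    H = nmf_activations(V, W)
    totals = {}
    start = 0
    for name in names:
        k = class_bases[name].shape[1]
        totals[name] = float(H[start:start + k].sum())
        start += k
    norm = sum(totals.values()) + 1e-12
    return {name: value / norm for name, value in totals.items()}

# Toy data (hypothetical): two classes with disjoint spectral support.
traffic = np.array([[1.0, 0.2], [0.5, 1.0], [0.1, 0.6],
                    [0.0, 0.0], [0.0, 0.0], [0.0, 0.0]])
siren = np.array([[0.0, 0.0], [0.0, 0.0], [0.0, 0.0],
                  [1.0, 0.1], [0.3, 1.0], [0.2, 0.4]])
bases = {"traffic": traffic, "siren": siren}

# A synthetic "recording" made only of traffic-like frames.
rng = np.random.default_rng(1)
V = traffic @ rng.random((2, 20))

composition = sound_class_composition(V, bases)
```

In this toy case nearly all activation falls on the `traffic` class, because the synthetic recording has no energy in the frequency bins covered by the `siren` bases. A composition vector of this kind could then serve as the input to a location classifier; which classifier to use is left open here, since the abstract does not specify one.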
