Prediction method of Soundscape Impressions using Environmental Sounds and Aerial Photographs
Yusuke Ono, Sunao Hara, Masanobu Abe

TL;DR
This study develops a deep learning approach combining sound and aerial image data to predict city soundscape impressions like pleasantness and eventfulness, aiding urban planning and tourism.
Contribution
It introduces a novel multi-modal prediction model integrating acoustic and aerial image features for soundscape impression estimation.
Findings
Aerial photographs and sound-source features effectively predict soundscape impressions.
Predicted sound-source features from acoustic and image data perform nearly as well as oracle features.
The method is expected to benefit urban planning and tourism development by quantifying city soundscapes.
Abstract
We investigate a method for quantifying city characteristics based on impressions of the sound environment. Quantifying these characteristics would benefit government policy planning, tourism projects, and similar efforts. In this study, we predict two soundscape impressions, pleasantness and eventfulness, using sound data collected by the cloud-sensing method. The collected sounds include meta information on the recording location obtained from the Global Positioning System. Furthermore, soundscape impressions and sound-source features are separately assigned to the cloud-sensing sounds through assessments defined by the Swedish Soundscape-Quality Protocol, which assesses the quality of the acoustic environment. The prediction models are built using deep neural networks with a multi-layer perceptron that takes as input a 10-second sound and the aerial photographs of its location. An acoustic…
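The two-input design described in the abstract can be illustrated with a minimal sketch. Because the abstract is truncated, everything concrete here is an assumption rather than the authors' architecture: the feature dimensions, the layer sizes, the late-fusion strategy (concatenating the two branches), and the names `ACOUSTIC_DIM`, `IMAGE_DIM`, and `predict_impressions` are all hypothetical. The weights are random and untrained, so the sketch only shows how data would flow through such a model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes; the source does not state feature or layer dimensions.
ACOUSTIC_DIM = 64   # e.g. a summary vector of a 10-second sound clip
IMAGE_DIM = 128     # e.g. an embedding of the aerial photograph
HIDDEN = 32


def init_layer(n_in, n_out):
    """Return (weights, bias) for one fully connected layer."""
    return rng.normal(0.0, 0.1, size=(n_in, n_out)), np.zeros(n_out)


def dense(x, layer, relu=True):
    """Apply one fully connected layer, optionally followed by ReLU."""
    w, b = layer
    y = x @ w + b
    return np.maximum(y, 0.0) if relu else y


# One branch per modality, then a shared head (assumed late fusion).
acoustic_layer = init_layer(ACOUSTIC_DIM, HIDDEN)
image_layer = init_layer(IMAGE_DIM, HIDDEN)
fusion_layer = init_layer(2 * HIDDEN, HIDDEN)
output_layer = init_layer(HIDDEN, 2)


def predict_impressions(acoustic_feats, image_feats):
    """Map a batch of (acoustic, image) features to two scores per sample.

    Column 0 is read as pleasantness and column 1 as eventfulness.
    """
    a = dense(acoustic_feats, acoustic_layer)
    v = dense(image_feats, image_layer)
    fused = np.concatenate([a, v], axis=1)
    h = dense(fused, fusion_layer)
    return dense(h, output_layer, relu=False)  # linear regression outputs


# Usage with random stand-in features for a batch of 4 recordings.
scores = predict_impressions(
    rng.normal(size=(4, ACOUSTIC_DIM)),
    rng.normal(size=(4, IMAGE_DIM)),
)
print(scores.shape)
```

In a real setting the random weights would be replaced by parameters trained against the listener-assigned impression scores, and the stand-in inputs by features extracted from the recordings and the photographs of their GPS locations.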
Taxonomy
Topics: Music and Audio Processing
