Single-Channel Distance-Based Source Separation for Mobile GPU in Outdoor and Indoor Environments

Hanbin Bae; Byungjun Kang; Jiwon Kim; Jaeyong Hwang; Hosang Sung; Hoon-Young Cho

arXiv:2501.03045·eess.AS·January 7, 2025

Open Access

TL;DR

This paper introduces a distance-based source separation (DSS) model optimized for mobile GPUs that targets outdoor as well as indoor environments, and reports improved energy efficiency and real-time inference on mobile devices.

Contribution

The study presents a DSS model designed for outdoor audio that combines a two-stage conformer block with linear relation-aware self-attention (RSA) and is deployed through the TensorFlow Lite GPU delegate.

Findings

01. Enhanced separation performance in outdoor environments.

02. Improved energy efficiency on mobile devices.

03. Real-time inference capability achieved.

Abstract

This study emphasizes the significance of exploring distance-based source separation (DSS) in outdoor environments. Unlike existing studies that primarily focus on indoor settings, the proposed model is designed to capture the unique characteristics of outdoor audio sources. It incorporates advanced techniques, including a two-stage conformer block, a linear relation-aware self-attention (RSA), and a TensorFlow Lite GPU delegate. While the linear RSA may not capture physical cues as explicitly as the quadratic RSA, the linear RSA enhances the model's context awareness, leading to improved performance on the DSS that requires an understanding of physical cues in outdoor and indoor environments. The experimental results demonstrated that the proposed model overcomes the limitations of existing approaches and considerably enhances energy efficiency and real-time inference speed on mobile…
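The abstract contrasts a linear RSA with a quadratic RSA without giving the formulation. As a rough illustration of that linear-versus-quadratic distinction only, the sketch below uses generic kernelized attention in NumPy. It is not the paper's RSA: the relation-aware terms are omitted, and the feature map (`elu(x) + 1`) and all function names are assumptions chosen for the example. It shows that reordering the matrix products avoids building the full T x T attention matrix while producing the same output.

```python
import numpy as np

def feature_map(x):
    # Assumed positive feature map, elu(x) + 1 (not taken from the paper).
    return np.where(x > 0, x + 1.0, np.exp(x))

def quadratic_kernel_attention(q, k, v):
    # Builds the full (T, T) similarity matrix: cost grows with T^2.
    phi_q, phi_k = feature_map(q), feature_map(k)
    scores = phi_q @ phi_k.T                       # (T, T)
    weights = scores / scores.sum(axis=1, keepdims=True)
    return weights @ v                             # (T, d_v)

def linear_kernel_attention(q, k, v):
    # Same result, but keys and values are summarized first,
    # so no (T, T) matrix is formed: cost grows linearly with T.
    phi_q, phi_k = feature_map(q), feature_map(k)
    kv = phi_k.T @ v                               # (d, d_v)
    k_sum = phi_k.sum(axis=0)                      # (d,)
    numer = phi_q @ kv                             # (T, d_v)
    denom = phi_q @ k_sum                          # (T,)
    return numer / denom[:, None]

# Hypothetical sizes: 50 frames, 8-dimensional heads.
rng = np.random.default_rng(0)
q, k, v = (rng.standard_normal((50, 8)) for _ in range(3))
out_quadratic = quadratic_kernel_attention(q, k, v)
out_linear = linear_kernel_attention(q, k, v)
```

The two functions agree up to floating-point error because matrix multiplication is associative. The saving in the linear form comes purely from the order of operations, which is the kind of property that matters for real-time inference on a mobile GPU.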

Taxonomy

Topics: Speech and Audio Processing · Indoor and Outdoor Localization Technologies · Underwater Vehicles and Communication Systems

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Focus