StreamAAD: Decoding Spatial Auditory Attention with a Streaming Architecture

Zelin Qiu; Dingding Yao; Junfeng Li

arXiv:2408.13522·cs.SD·August 28, 2024


TL;DR

StreamAAD introduces a streaming architecture for spatial auditory attention decoding that models inter-window relationships; combined with a model ensemble, it ranked first in Track 1 of the Chinese AAD Challenge at ISCSLP 2024.

Contribution

The paper proposes a novel streaming decoding architecture, StreamAAD, that considers relationships between decision windows for improved spatial auditory attention decoding.

Findings

01 Achieved first place in Track 1 of the Chinese AAD Challenge at ISCSLP 2024.

02 Performed significantly better than the challenge baseline when combined with a model ensemble strategy.

03 Demonstrated the effectiveness of modeling relationships between decision windows.

Abstract

In this paper, we present our approach for Track 1 of the Chinese Auditory Attention Decoding (Chinese AAD) Challenge at ISCSLP 2024. Most existing spatial auditory attention decoding (Sp-AAD) methods employ an isolated window architecture, focusing solely on global invariant features without considering relationships between different decision windows, which can lead to suboptimal performance. To address this issue, we propose a novel streaming decoding architecture, termed StreamAAD. In StreamAAD, decision windows are input to the network as a sequential stream and decoded in order, allowing for the modeling of inter-window relationships. Additionally, we employ a model ensemble strategy, achieving significantly better performance than the baseline and ranking first in the challenge.
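The contrast between isolated-window and streaming decoding can be illustrated with a small sketch. This is not the StreamAAD network, which the abstract does not specify: the per-window score (a weighted log-power contrast), the exponential carry-over state, and the names `split_into_windows`, `window_score`, `decode_isolated`, `decode_streaming`, and `carry` are all hypothetical stand-ins chosen only to show what "decoding windows in order while carrying information between them" might look like.

```python
import numpy as np


def split_into_windows(eeg, win_len):
    """Cut a (channels, samples) array into consecutive, non-overlapping
    decision windows of shape (n_windows, channels, win_len)."""
    n_win = eeg.shape[1] // win_len
    trimmed = eeg[:, : n_win * win_len]  # drop the incomplete tail
    return trimmed.reshape(eeg.shape[0], n_win, win_len).transpose(1, 0, 2)


def window_score(window, weights):
    """Hypothetical per-window feature (an assumption, not from the paper):
    a weighted sum of per-channel log-power. Positive means class 1."""
    log_power = np.log(np.mean(window ** 2, axis=1) + 1e-12)
    return float(weights @ log_power)


def decode_isolated(windows, weights):
    """Isolated-window baseline: every window is decided on its own."""
    return [int(window_score(w, weights) > 0) for w in windows]


def decode_streaming(windows, weights, carry=0.8):
    """Streaming sketch: windows are processed in order and a running state
    carries evidence from earlier windows into later decisions.
    The exponential carry-over is an illustrative assumption."""
    state = 0.0
    decisions = []
    for w in windows:
        state = carry * state + (1.0 - carry) * window_score(w, weights)
        decisions.append(int(state > 0))
    return decisions


# Toy two-channel recording: four windows favour class 1, one outlier favours class 0.
a = np.stack([2.0 * np.ones(4), np.ones(4)])
b = np.stack([np.ones(4), 2.0 * np.ones(4)])
eeg = np.concatenate([a, a, a, b, a], axis=1)
weights = np.array([1.0, -1.0])

windows = split_into_windows(eeg, win_len=4)
isolated = decode_isolated(windows, weights)
streamed = decode_streaming(windows, weights)
```

In this toy setup the isolated decoder flips its decision on the single outlier window, while the streaming decoder keeps the earlier decision because its state still reflects the preceding windows. Setting `carry=0.0` removes the carried state and reduces the streaming decoder to the isolated one.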

Taxonomy

Topics: Music and Audio Processing · Tactile and Sensory Interactions · Multisensory perception and integration