ListenNet: A Lightweight Spatio-Temporal Enhancement Nested Network for Auditory Attention Detection

Cunhang Fan; Xiaoke Yang; Hongyu Zhang; Ying Chen; Lu Li; Jian Zhou; Zhao Lv

arXiv:2505.10348 · cs.HC · May 16, 2025

TL;DR

ListenNet is a lightweight neural network that captures spatio-temporal dependencies in EEG signals for auditory attention detection. It outperforms existing methods on three public datasets while using roughly 7 times fewer trainable parameters.

Contribution

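The paper introduces ListenNet, a neural network architecture for EEG-based auditory attention detection built from three modules for spatio-temporal feature extraction: a Spatio-temporal Dependency Encoder (STDE), Multi-scale Temporal Enhancement (MSTE), and Cross-Nested Attention (CNA).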

Findings

01. Outperforms state-of-the-art methods on three public datasets.
02. Uses approximately 7 times fewer trainable parameters.
03. Is effective in both subject-dependent and subject-independent settings.

Abstract

Auditory attention detection (AAD) aims to identify the direction of the attended speaker in multi-speaker environments from brain signals such as electroencephalography (EEG). However, existing EEG-based AAD methods overlook the spatio-temporal dependencies of EEG signals, which limits their decoding and generalization abilities. To address these issues, this paper proposes a Lightweight Spatio-Temporal Enhancement Nested Network (ListenNet) for AAD. ListenNet has three key components: a Spatio-temporal Dependency Encoder (STDE), Multi-scale Temporal Enhancement (MSTE), and Cross-Nested Attention (CNA). The STDE reconstructs dependencies between consecutive time windows across channels, improving the robustness of dynamic pattern extraction. The MSTE captures temporal features at multiple scales to represent both fine-grained and long-range temporal patterns. In addition, the…
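To make the idea of "temporal features at multiple scales" concrete, here is a minimal sketch in plain Python. It is not the paper's MSTE module, which is a learned network component whose details are not given in this excerpt. It swaps in fixed moving-average kernels as a stand-in, and the function names `moving_average` and `multi_scale_features` as well as the kernel sizes are assumptions made for illustration only.

```python
from typing import List, Sequence


def moving_average(signal: Sequence[float], kernel: int) -> List[float]:
    """Valid-mode moving average with window length `kernel` (hypothetical helper)."""
    if kernel < 1 or kernel > len(signal):
        raise ValueError("kernel must be between 1 and len(signal)")
    window_sum = sum(signal[:kernel])
    out = [window_sum / kernel]
    # Slide the window one sample at a time, updating the running sum.
    for t in range(kernel, len(signal)):
        window_sum += signal[t] - signal[t - kernel]
        out.append(window_sum / kernel)
    return out


def multi_scale_features(
    eeg: Sequence[Sequence[float]], kernels: Sequence[int] = (2, 4, 8)
) -> List[List[float]]:
    """Return one feature per (channel, scale): the mean absolute amplitude
    of the channel after smoothing with that kernel size.

    Small kernels keep fine-grained fluctuations; large kernels keep only
    slower, longer-range trends. This is an illustrative assumption, not
    the learned multi-scale branch described in the paper.
    """
    features = []
    for channel in eeg:
        row = []
        for k in kernels:
            smoothed = moving_average(channel, k)
            row.append(sum(abs(v) for v in smoothed) / len(smoothed))
        features.append(row)
    return features
```

With a fast alternating channel such as `[1, -1] * 8` and a constant channel such as `[0.5] * 16`, the alternating channel's feature drops to zero once the kernel spans a full oscillation, while the constant channel's feature is unchanged at every scale. A learned multi-scale module would replace the fixed averages with trainable filters, but the contrast between short and long receptive fields is the same.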

Code & Models

Repositories

fchest/listennet (PyTorch, official)

Taxonomy

Topics: Music and Audio Processing · Hearing Loss and Rehabilitation · Speech and Audio Processing