Beyond Hearing: Learning Task-agnostic ExG Representations from Earphones via Physiology-informed Tokenization
Hyungjun Yoon, Seungjoo Lee, Yu Yvonne Wu, Xiaomeng Chen, Taiting Lu, Freddy Yifei Liu, Taeckyung Lee, Hyeongheon Cha, Haochen Zhao, Gaoteng Zhao, Sung-Ju Lee, Cecilia Mascolo, Dongyao Chen, Lili Qiu

TL;DR
This paper introduces a scalable, task-agnostic approach for learning generalizable representations of electrophysiological (ExG) signals recorded from earphones. It uses physiology-informed tokenization to enable robust analysis across multiple human senses in real-world settings.
Contribution
The paper presents Physiology-informed Multi-band Tokenization (PiMT), a novel method that decomposes ExG signals into meaningful tokens for versatile, task-agnostic representation learning from earphone data.
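As a rough illustration of what a band-wise decomposition of an ExG window can look like, the sketch below maps one window to a small set of per-band power fractions. It is a simplified stand-in, not PiMT itself: this summary does not specify the paper's 12 tokens, so the band edges (conventional EEG ranges), the naive DFT, and the names `BANDS`, `dft_power`, and `band_tokens` are all assumptions made for the example.

```python
import cmath
import math

# Hypothetical band edges in Hz: conventional EEG ranges, NOT the paper's 12 tokens.
BANDS = {
    "delta": (0.5, 4.0),
    "theta": (4.0, 8.0),
    "alpha": (8.0, 13.0),
    "beta": (13.0, 30.0),
}


def dft_power(signal):
    """Return the power of each non-negative frequency bin via a naive DFT."""
    n = len(signal)
    powers = []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal)
        )
        powers.append(abs(coeff) ** 2)
    return powers


def band_tokens(signal, fs, bands=BANDS):
    """Map one window to per-band fractions of the total in-band power."""
    n = len(signal)
    powers = dft_power(signal)
    raw = {}
    for name, (lo, hi) in bands.items():
        # Bin k corresponds to frequency k * fs / n in Hz.
        raw[name] = sum(p for k, p in enumerate(powers) if lo <= k * fs / n < hi)
    total = sum(raw.values()) or 1.0  # avoid dividing by zero on a flat window
    return {name: value / total for name, value in raw.items()}


# Usage: a 2-second window sampled at 64 Hz that holds a pure 10 Hz sine.
fs = 64
window = [math.sin(2 * math.pi * 10 * t / fs) for t in range(2 * fs)]
tokens = band_tokens(window, fs)
```

In this toy case nearly all of the power lands in the hypothetical alpha band. A learned tokenizer would presumably keep richer per-band information than a single power fraction, which is one reason to read this only as a sketch of the general idea.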
Findings
PiMT outperforms state-of-the-art methods across diverse tasks.
Collected 50 hours of real-world ExG data with earphones.
Demonstrated effective analysis across five human senses.
Abstract
Electrophysiological (ExG) signals offer valuable insights into human physiology, yet building foundation models that generalize across everyday tasks remains challenging due to two key limitations: (i) insufficient data diversity, as most ExG recordings are collected in controlled labs with bulky, expensive devices; and (ii) task-specific model designs that require tailored processing (i.e., targeted frequency filters) and architectures, which limit generalization across tasks. To address these challenges, we introduce an approach for scalable, task-agnostic ExG monitoring in the wild. We collected 50 hours of unobtrusive free-living ExG data with an earphone-based hardware prototype to narrow the data diversity gap. At the core of our approach is Physiology-informed Multi-band Tokenization (PiMT), which decomposes ExG signals into 12 physiology-informed tokens, followed by a…
Peer Reviews
Decision: ICLR 2026 Poster
- Solid prototype building and dataset collection efforts. The dataset will be highly valuable to the community if made public, especially considering its multimodal nature and the study design that covers a wide range of tasks of interest.
- Solid experimental efforts. The method is applied to a wide range of tasks, including both private and public datasets, and shows good performance.
- Very interesting saliency analysis, showing both how different frequency components are be
1. It is unclear whether the self-collected dataset is superior to existing non-free-living datasets. Specifically:
   - The paper lacks experiments showing how the free-living dataset compares to existing larger-scale pre-training datasets, for example, the multimodal sleep datasets that contain thousands of hours of data (You Snooze You Win challenge, or TU datasets, as used in [1, 2, 3]). The paper hypothesizes the potential benefits of collecting free-living ExG data, but the experimental
1. Ambitious Data Collection and New Benchmark: The DailySense dataset, comprising the largest known free-living ExG recordings across diverse human activities and all five senses, marks a significant step toward real-world applicability for physiological sensing. The use of an earphone-based device (NeuroBuds) demonstrates strong engineering innovation, promising unobtrusive and scalable physiological monitoring.
2. Principled Tokenization Approach: The PiMT framework’s multi-band tokenization
1. Limited Generalization to Unseen Subjects and Modest Cohort Size: While the participant count (N=22) is comparable to prior lab-based ExG studies, it remains insufficient to support strong claims of robust population-level generalization. This limitation is clearly evidenced by the significant performance drop in the cross-subject setting (Table 7), where the average F1-score falls to ~58%. Although the authors acknowledge this challenge and provide Leave-One-Subject-Out (LOSO) results (Figur
- The authors tackle the important challenge of model generalization in ExG modeling. In many cases in this modeling domain, generalization is largely accounted for with dataset scale, requiring more resources. However, the novel tokenization scheme proposed in this work achieves improvements in generalization by developing an encoding scheme that seeks to explicitly account for known physiological principles of ExG signals.
- Proposed strategies (pre-training and PiMT) consistently outperform
- The physiologically informed aspect of PiMT assumes electrode configurations that can yield signals of interest at each frequency band and does not account for mixing of these signals, potentially limiting generality of the scheme to different hardware configurations. For instance, computing EOG signal features from occipital electrodes may not have much physiological meaning (the model would still likely learn features relevant to the training task, but the physiological intuition diminishes
Taxonomy
Topics: EEG and Brain-Computer Interfaces · Emotion and Mood Recognition · Neural dynamics and brain function
