Neck-Learn: Attention-Based Multiple Instance Learning and Ensemble Framework for Ecological Momentary Assessment

Ahsan Jamal Cheema

arXiv:2605.02700·eess.AS·May 5, 2026

TL;DR

This paper presents a hybrid deep learning framework combining CNN-based multiple instance learning and gradient-boosted trees to improve ambulatory detection of vocal hyperfunction from daily neck-surface accelerometer data.

Contribution

It introduces a novel architecture that preserves within-day temporal dynamics, outperforming existing challenge baselines in detecting vocal hyperfunction.

Findings

01

Achieved held-out test AUCs of 0.879 for PVH detection (Rank 5) and 0.848 for NPVH detection (Rank 3)

02

Outperformed the challenge baselines, whose AUCs were 0.82 (PVH) and 0.77 (NPVH)

03

Provided clinically relevant insights into vocal hyperfunction

Abstract

Vocal hyperfunction (VH) is a prevalent voice disorder whose ambulatory detection remains challenging despite extensive daily voice data. Prior approaches capture week-long neck-surface accelerometer recordings but collapse them into fixed-length subject-level feature vectors, discarding the within-day temporal dynamics that encode nuanced interactions among voicing features. We introduce a novel hybrid architecture combining gradient-boosted trees on day-level distributional features with a CNN-based multiple instance learning (MIL) framework that preserves and learns from temporal dynamics throughout each day. On the held-out test set, our model exceeds the challenge baselines (AUC: 0.82 PVH, 0.77 NPVH), achieving AUCs of 0.879 for PVH (Rank 5) and 0.848 for NPVH (Rank 3), while also providing clinically relevant insights into both pathologies.
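The abstract does not spell out the attention or ensembling mechanics, so the following is only a minimal sketch of how attention-based MIL pooling is commonly formulated, not the paper's implementation. It treats one day as a bag of instance embeddings (for example, CNN outputs for successive time windows), scores each instance with a small tanh layer, and forms a softmax-weighted average. The parameter names `v` and `w`, the function names, and the convex blend with a gradient-boosted-tree probability via `alpha` are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(instances, v, w):
    """Attention-based MIL pooling over one bag (hypothetical helper).

    instances: list of K embeddings, each a list of D floats
    v: L x D projection matrix (assumed learned parameter)
    w: length-L scoring vector (assumed learned parameter)
    Returns (bag_embedding, attention_weights).
    """
    scores = []
    for h in instances:
        # Hidden representation: tanh(V h), one value per row of V.
        hidden = [math.tanh(sum(vi * hi for vi, hi in zip(row, h))) for row in v]
        # Scalar relevance score: w . tanh(V h).
        scores.append(sum(wi * x for wi, x in zip(w, hidden)))
    weights = softmax(scores)
    dim = len(instances[0])
    # Bag embedding: attention-weighted average of the instance embeddings.
    bag = [sum(a * h[d] for a, h in zip(weights, instances)) for d in range(dim)]
    return bag, weights


def blend(p_mil, p_gbt, alpha=0.5):
    """Convex combination of two probability estimates (assumed ensembling rule)."""
    return alpha * p_mil + (1.0 - alpha) * p_gbt


# Toy day with three 2-dimensional window embeddings.
day = [[0.1, 0.2], [0.9, 0.8], [0.0, 0.1]]
bag, weights = attention_pool(day, v=[[1.0, 0.0], [0.0, 1.0]], w=[1.0, 1.0])
```

Because the weights sum to one, they can be read as a rough indication of which windows of the day contributed most to the bag-level representation, which is one way a model of this kind could surface within-day structure.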
