# Hierarchical Self-Distillation with Attention for Class-Imbalanced Acoustic Event Classification in Elevators

**Authors:** Shengying Yang, Lingyan Chou, He Li, Zhenyu Xu, Boyang Feng, Jingsheng Lei

PMC · DOI: 10.3390/s26020589 · Sensors (Basel, Switzerland) · 2026-01-15

## TL;DR

This paper proposes a hierarchical self-distillation framework, combined with self-attentive temporal pooling and focal loss, to improve acoustic event classification in elevators, particularly for rare events.

## Contribution

A hierarchical self-distillation framework with attention and focal loss is proposed to handle class imbalance and acoustic interference in elevator monitoring.
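As a rough illustration of why focal loss helps with class imbalance, the sketch below implements the standard focal loss definition, which scales each sample's cross-entropy by `(1 - p_t) ** gamma` so that confidently classified samples contribute little. The values `gamma=2.0` and the optional per-class `alpha` weights are assumptions chosen for illustration; this summary does not state the paper's settings.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def focal_loss(logits, target, gamma=2.0, alpha=None):
    """Focal loss for one sample: -alpha_t * (1 - p_t)^gamma * log(p_t).

    gamma and alpha are illustrative assumptions, not the paper's values.
    """
    p_t = softmax(logits)[target]
    weight = 1.0 if alpha is None else alpha[target]
    # Clamp p_t to avoid log(0) when the true class probability underflows.
    return -weight * (1.0 - p_t) ** gamma * math.log(max(p_t, 1e-12))


def cross_entropy(logits, target):
    """Plain cross-entropy is focal loss with gamma = 0."""
    return focal_loss(logits, target, gamma=0.0)


# An easy sample (true class already dominant) and a hard one (true class unlikely).
easy_logits, hard_logits, true_class = [4.0, 0.0, 0.0], [0.0, 2.0, 0.0], 0
easy_ratio = focal_loss(easy_logits, true_class) / cross_entropy(easy_logits, true_class)
hard_ratio = focal_loss(hard_logits, true_class) / cross_entropy(hard_logits, true_class)
print(f"focal/CE ratio, easy sample: {easy_ratio:.4f}")
print(f"focal/CE ratio, hard sample: {hard_ratio:.4f}")
```

The easy sample is down-weighted far more strongly than the hard one, which shifts the training signal towards hard-to-classify samples such as those from minority classes.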

## Key findings

- The model exceeds 90% in both accuracy and F1-score on the public UrbanSound8K dataset and on a proprietary industrial elevator audio dataset.
- It substantially improves recognition of rare acoustic events under class imbalance.
- Self-attentive temporal pooling weights discriminative time-frequency features to mitigate interference from temporally overlapping sounds in confined spaces.

## Abstract

Acoustic-based anomaly detection in elevators is crucial for predictive maintenance and operational safety, yet it faces significant challenges in real-world settings, including pervasive multi-source acoustic interference within confined spaces and severe class imbalance in collected data, which critically degrades the detection performance for minority yet critical acoustic events. To address these issues, this study proposes a novel hierarchical self-distillation framework. The method embeds auxiliary classifiers into the intermediate layers of a backbone network, creating a deep teacher–shallow student knowledge transfer paradigm optimized jointly via Kullback–Leibler divergence and feature alignment losses. A self-attentive temporal pooling layer is introduced to adaptively weigh discriminative time-frequency features, thereby mitigating temporal overlap interference, while a focal loss function is employed specifically in the teacher model to recalibrate the learning focus towards hard-to-classify minority samples. Extensive evaluations on the public UrbanSound8K dataset and a proprietary industrial elevator audio dataset demonstrate that the proposed model achieves superior performance, exceeding 90% in both accuracy and F1-score. Notably, it yields substantial improvements in recognizing rare events, validating its robustness for elevator acoustic monitoring.
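The two mechanisms named in the abstract, self-attentive temporal pooling and teacher-to-student distillation, can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the authors' implementation: the single learned `score_vector`, the temperature of 2.0 with the conventional `temperature ** 2` scaling, the mean-squared-error form of feature alignment, and the weight `beta` are all hypothetical choices, and teacher and student features are assumed to share the same dimension.

```python
import math


def log_softmax(xs, temperature=1.0):
    """Numerically stable log-softmax of xs / temperature."""
    scaled = [x / temperature for x in xs]
    m = max(scaled)
    log_z = m + math.log(sum(math.exp(s - m) for s in scaled))
    return [s - log_z for s in scaled]


def attentive_pool(frames, score_vector):
    """Pool T frame embeddings (each of length D) into one D-dim vector.

    Each frame gets a scalar score (dot product with score_vector, a
    hypothetical learned parameter); softmax over time gives the weights.
    """
    scores = [sum(w * h for w, h in zip(score_vector, frame)) for frame in frames]
    weights = [math.exp(lp) for lp in log_softmax(scores)]
    dim = len(frames[0])
    pooled = [sum(a * frame[d] for a, frame in zip(weights, frames)) for d in range(dim)]
    return pooled, weights


def kl_distill(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened class distributions."""
    log_p = log_softmax(teacher_logits, temperature)
    log_q = log_softmax(student_logits, temperature)
    kl = sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))
    return temperature ** 2 * kl  # assumed scaling, common in distillation


def feature_alignment(teacher_feat, student_feat):
    """Mean squared error between features (assumed form of the alignment loss)."""
    return sum((t - s) ** 2 for t, s in zip(teacher_feat, student_feat)) / len(teacher_feat)


def student_loss(teacher_logits, student_logits, teacher_feat, student_feat, beta=0.1):
    """Loss for one shallow auxiliary classifier guided by the deepest classifier."""
    return kl_distill(teacher_logits, student_logits) + beta * feature_alignment(
        teacher_feat, student_feat
    )


# Toy example: two frames, the first of which scores much higher.
frames = [[1.0, 0.0], [0.0, 1.0]]
pooled, weights = attentive_pool(frames, score_vector=[5.0, 0.0])
loss = student_loss([3.0, 0.0, 0.0], [1.0, 1.0, 0.0], pooled, [0.5, 0.5])
print("attention weights:", weights)
print("student loss:", loss)
```

In a hierarchical setup, `student_loss` would be evaluated once per auxiliary classifier attached to an intermediate layer and summed with the teacher's own classification loss; how the paper weights these terms is not stated in this summary.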

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12845887/full.md

## Figures

16 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12845887/full.md

## References

41 references — full list in the complete paper: https://tomesphere.com/paper/PMC12845887/full.md

---
Source: https://tomesphere.com/paper/PMC12845887