# A dual-input deep learning architecture for classification and latency estimation in ABR signals

**Authors:** Youssef Darahem, Oguz Yilmaz, Halil B. Saldirim, Berna Mutlu, Hasan F. Ates, Bahadir K. Gunturk

PMC · DOI: 10.3389/fmed.2025.1693921 · Frontiers in Medicine · 2025-11-11

## TL;DR

This paper introduces a deep learning model that automatically detects wave V in auditory brainstem response (ABR) signals and estimates its latency, with the aim of making hearing assessments faster and less subjective.

## Contribution

A multi-task deep learning model with a paired-signal approach for simultaneous wave V detection and latency prediction in ABR signals.

## Key findings

- The joint model achieves an F1-score of 0.92 for wave V classification.
- The model attains an R² of 0.90 for latency prediction.
- The paired-signal approach improves performance over single-input methods.

## Abstract

Auditory brainstem response (ABR) is an objective neurophysiological evaluation designed to measure the electrical activity originating from the auditory nerve and brainstem in response to auditory stimulation. It records synchronous neural activity as it propagates along the auditory pathway and is characterized by several distinct waves, most notably waves I, III, and V. Wave V plays a central clinical role, since its presence and latency are routinely used to assess a patient's hearing status. However, manual identification and localization of wave V are time-consuming and subjective. Previous work has explored automated detection methods to reduce this burden.

In this paper, we make two primary contributions. First, we propose a multi-task deep learning pipeline that simultaneously (i) detects the presence of wave V and (ii) predicts its latency, thus eliminating the need for separate manual interpretation steps and enhancing clinical usability. Second, we introduce a paired-signal approach inspired by the audiologist's practice of comparing responses at multiple click intensities, in particular using responses at high intensities, where waves are more prominent, as a reference. Each input to our deep learning model consists of the test signal together with its corresponding 80 dB reference from the same recording session. This provides the model with richer contextual information, and we show that the paired-signal approach outperforms the single-input approach. For multi-task learning, we design a network that consists of a backbone and two branches, one for latency prediction and the other for classifying whether wave V is present. We first train a latency-prediction network and then freeze its feature-extraction layers to initialize a classification branch. Finally, we fine-tune the entire network using a joint loss function that balances the classification and regression objectives.
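The paired input and the joint objective described above can be illustrated with a minimal, framework-free sketch. It is not the authors' implementation: the abstract does not state the loss terms or their weighting, so the binary cross-entropy term, the mean-squared-error term, the weight `alpha`, the masking of the latency term for signals without wave V, and the function names `make_paired_input` and `joint_loss` are all assumptions made for illustration.

```python
import math


def make_paired_input(test_signal, reference_signal):
    """Stack a test sweep with its 80 dB reference as a two-channel input.

    Assumption: both signals come from the same session and have equal length.
    """
    if len(test_signal) != len(reference_signal):
        raise ValueError("test and reference signals must have equal length")
    return [list(test_signal), list(reference_signal)]


def binary_cross_entropy(p, y, eps=1e-7):
    """Cross-entropy for one predicted probability p and a label y in {0, 1}."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def joint_loss(prob_wave_v, pred_latency_ms, has_wave_v, true_latency_ms, alpha=0.5):
    """Weighted sum of a classification loss and a regression loss over a batch.

    Assumptions (not taken from the paper): the weight alpha, and counting the
    latency error only for samples where wave V is actually present.
    """
    n = len(has_wave_v)
    cls = sum(binary_cross_entropy(p, y) for p, y in zip(prob_wave_v, has_wave_v)) / n
    sq_errors = [
        (p - t) ** 2
        for p, t, y in zip(pred_latency_ms, true_latency_ms, has_wave_v)
        if y == 1
    ]
    reg = sum(sq_errors) / len(sq_errors) if sq_errors else 0.0
    return alpha * cls + (1.0 - alpha) * reg


# Usage: one sample with wave V (true latency 6.5 ms) and one without.
paired = make_paired_input([0.1, 0.2], [0.3, 0.4])
loss = joint_loss([0.9, 0.1], [6.0, 0.0], [1, 0], [6.5, 0.0], alpha=0.5)
```

Masking the regression term is one common way to avoid penalizing latency predictions for signals that have no wave V to localize; other weightings or loss forms would fit the abstract's description equally well.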

Experimental results demonstrate that our joint model outperforms conventional single-task approaches. For classification, it achieves an F1-score of 0.92; for latency regression, it attains an R² of 0.90.
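For reference, the two reported metrics can be computed as in the sketch below, which uses the standard textbook definitions. The abstract does not say how the F1-score was averaged or which samples entered the R² calculation, so treating wave V present as the positive class and the function names `f1_score` and `r2_score` are assumptions.

```python
def f1_score(y_true, y_pred):
    """F1 for binary labels, treating 1 (wave V present) as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all true values are identical")
    return 1.0 - ss_res / ss_tot


# Usage with toy values (not data from the paper).
toy_f1 = f1_score([1, 1, 0, 0], [1, 0, 1, 0])
toy_r2 = r2_score([1.0, 2.0, 3.0], [2.0, 2.0, 2.0])
```

An R² of 1 means the predicted latencies match the annotated ones exactly, while 0 means the predictions do no better than always predicting the mean latency.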

Our findings highlight the promise of convolutional neural networks for enhancing ABR analysis and underscore their potential to streamline clinical workflows in the diagnosis of auditory disorders.

## Full-text entities

- **Diseases:** auditory disorders (MESH:D006311)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12644055/full.md

## Figures

9 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12644055/full.md

## References

17 references — full list in the complete paper: https://tomesphere.com/paper/PMC12644055/full.md

---
Source: https://tomesphere.com/paper/PMC12644055