# Imbalanced Power Spectral Generation for Respiratory Rate and Uncertainty Estimations Based on Photoplethysmography Signal

**Authors:** Soojeong Lee, Mugahed A. Al-antari, Gyanendra Prasad Joshi, Yeong Hyeon Gu

PMC · DOI: 10.3390/s25051437 · 2025-02-26

## TL;DR

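This paper introduces a method for more accurate respiratory rate estimation from photoplethysmography signals in health monitoring systems. It addresses data imbalance in biosignal datasets by adding bootstrap-generated artificial feature curves to the under-represented breathing ranges.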

## Contribution

A novel methodology combining bootstrap-based imbalanced power spectral generation with machine learning to estimate respiratory rates and uncertainty.

## Key findings

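- On the Beth Israel Deaconess Medical Center dataset, the proposed GPR-IPSG model (Gaussian process regression with imbalanced power spectral generation) achieves mean absolute errors of 0.79 and 1.47 brpm for respiratory rate estimation.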
- Bootstrap-based artificial feature curves improve prediction accuracy and stability in imbalanced data scenarios.
- The authors suggest the method can improve clinical home-based monitoring systems by providing more reliable respiratory rate predictions.
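
For readers unfamiliar with the metric, the mean absolute error (MAE) cited above is the average absolute difference between reference and estimated respiratory rates. The following is a minimal sketch of the standard definition. The function name `mean_absolute_error` and the sample values are illustrative assumptions and are not taken from the paper.

```python
def mean_absolute_error(reference, estimate):
    """Average absolute difference between paired values (here in brpm)."""
    if len(reference) != len(estimate) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(r - e) for r, e in zip(reference, estimate)) / len(reference)


# Made-up reference and estimated respiratory rates, for illustration only.
reference_rr = [12.0, 18.0, 25.0]
estimated_rr = [13.0, 18.0, 23.0]
print(mean_absolute_error(reference_rr, estimated_rr))  # 1.0
```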

## Abstract

Respiratory rate (RR) changes in the elderly can indicate serious diseases. Thus, accurate estimation of RRs for cardiopulmonary function is essential for home health monitoring systems. However, machine learning (ML) algorithm errors embedded in health monitoring systems can be problematic in medical decision-making because some data have much larger sample sizes in the training set than others. This difference in sample size implies biosignal data imbalance. Therefore, we propose a novel methodology that combines bootstrap-based imbalanced continuous power spectral generation (IPSG) with ML approaches to estimate RRs and uncertainty to address data imbalance. The sample differences between normal breathing (12–20 breaths per minute (brpm)), dyspnea (≥20 brpm), and hypopnea (<8 brpm) show significant data imbalance, which can affect the learning of ML algorithms. Hence, the normal breathing part with a large amount of data is well-trained. In contrast, the dyspnea and hypopnea parts with relatively fewer data are not well-trained, and this data imbalance makes it difficult to estimate the reference variables of the actual dyspnea and hypopnea data parts, thus generating significant errors. Hence, we apply ML models by mixing artificial feature curves generated using a bootstrap model with the original feature curves to estimate RRs and solve this problem. As a result, the nonparametric bootstrap approach significantly increases the number of artificial feature curves. The generated artificial feature curves are selectively utilized in the highly imbalanced parts. Therefore, we confirm that IPSG is efficiently trained to predict the complex nonlinear relationship between the feature vectors obtained from the photoplethysmography signal and the reference RR. The proposed methodology shows more accurate prediction performance and uncertainty. 
Combining the proposed Gaussian process regression (GPR) with IPSG based on the Beth Israel Deaconess Medical Center dataset, the mean absolute error of the RR is 0.79 and 1.47 brpm. Our approach achieves high stability and accuracy by randomly mixing original and artificial feature curves. The proposed GPR-IPSG model can improve the performance of clinical home-based monitoring systems and design a reliable framework.
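
The abstract describes generating artificial feature curves with a nonparametric bootstrap and mixing them with the original curves in the under-represented ranges. A minimal sketch of that general idea follows, using only the Python standard library. It is not the authors' implementation: the abstract does not give the exact resampling procedure, so the point-wise averaging of each resample, the group labels, the helper names `bootstrap_curve` and `balance_with_bootstrap`, and the toy data are all assumptions made for illustration.

```python
import random


def bootstrap_curve(curves, rng):
    """Build one artificial curve: the point-wise mean of a resample
    drawn with replacement from the group's original curves."""
    resample = [rng.choice(curves) for _ in curves]
    n_points = len(curves[0])
    return [sum(c[i] for c in resample) / len(resample) for i in range(n_points)]


def balance_with_bootstrap(groups, seed=0):
    """Top up every smaller group with artificial curves until it matches
    the largest group, then shuffle originals and artificial curves together."""
    rng = random.Random(seed)
    target = max(len(curves) for curves in groups.values())
    balanced = {}
    for label, curves in groups.items():
        artificial = [bootstrap_curve(curves, rng) for _ in range(target - len(curves))]
        mixed = curves + artificial  # new list; the input is left unchanged
        rng.shuffle(mixed)
        balanced[label] = mixed
    return balanced


# Toy power-spectral feature curves (hypothetical values, three bins each).
groups = {
    "normal": [[0.1, 0.9, 0.2], [0.2, 0.8, 0.1], [0.1, 1.0, 0.3], [0.3, 0.7, 0.2]],
    "high_rr": [[0.1, 0.2, 0.9], [0.2, 0.1, 0.8]],
}
balanced = balance_with_bootstrap(groups)
print({label: len(curves) for label, curves in balanced.items()})
# {'normal': 4, 'high_rr': 4}
```

In this sketch only the minority group receives artificial curves, which mirrors the abstract's statement that generated curves are used selectively in the highly imbalanced parts. The balanced set would then be passed to a regressor such as GPR.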

## Full-text entities

- **Diseases:** hypopnea (MESH:D012891), dyspnea (MESH:D004417)

## Figures

11 figures with captions in the complete paper: https://tomesphere.com/paper/PMC11902385/full.md

---
Source: https://tomesphere.com/paper/PMC11902385