# HIPER-CHAD: Hybrid Integrated Prediction-Error Reconstruction-Based Anomaly Detection for Multivariate Indoor Environmental Time-Series Data

**Authors:** Vandha Pradwiyasma Widartha, Chang Soo Kim

PMC · DOI: 10.3390/s26010171 · Sensors (Basel, Switzerland) · 2025-12-26

## TL;DR

This paper introduces HIPER-CHAD, a hybrid model that detects subtle anomalies in noisy multivariate indoor environmental time series by forecasting with an LSTM and training a VAE on the resulting prediction errors.

## Contribution

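HIPER-CHAD is a hybrid model that separates temporal modeling of normal behavior (an LSTM forecaster) from probabilistic modeling of prediction uncertainty (a VAE trained on the forecast residuals), so that the anomaly score is robust to stochastic fluctuations while remaining sensitive to truly abnormal events.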

## Key findings

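- HIPER-CHAD achieves an F1-score of 0.8571 with perfect recall, ahead of all eight traditional machine learning and deep learning baselines; the next best, the LSTM Autoencoder, reaches F1 = 0.8095.
- In a time-step sensitivity analysis, a 20-step window gives the best F1-score, 0.884.
- The evaluation uses a real-world multivariate indoor environmental dataset, with a synthetic ground truth generated by a 99th percentile-based criterion.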
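The reported F1-score and perfect recall together imply a precision value, which the summary does not state. The sketch below inverts the standard definition F1 = 2PR / (P + R); it assumes the F1 of 0.8571 and the recall of 1.0 come from the same evaluation run, as the abstract's wording suggests. The function name `precision_from_f1` is illustrative, not taken from the paper.

```python
def precision_from_f1(f1: float, recall: float) -> float:
    """Solve F1 = 2PR / (P + R) for precision P, given F1 and recall R."""
    denom = 2.0 * recall - f1
    if denom <= 0:
        raise ValueError("no valid precision exists for this F1/recall pair")
    return f1 * recall / denom


# Reported values: F1 = 0.8571 with perfect recall (R = 1.0).
implied_precision = precision_from_f1(0.8571, 1.0)
print(round(implied_precision, 2))  # 0.75
```

Under that assumption, roughly one in four flagged points would be a false alarm, while no labeled anomaly is missed.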

## Abstract

This study introduces the Hybrid Integrated Prediction-Error Reconstruction-based Anomaly Detection (HIPER-CHAD) model, which addresses the challenge of reliably detecting subtle anomalies in noisy multivariate indoor environmental time-series data. The main objective is to separate temporal modeling of normal behavior from probabilistic modeling of prediction uncertainty, ensuring that the anomaly score becomes robust to stochastic fluctuations while remaining sensitive to truly abnormal events. The HIPER-CHAD architecture first employs a Long Short-Term Memory (LSTM) network to forecast the next time step’s sensor readings, subsequently forming a residual error vector that captures deviations from the expected temporal pattern. A Variational Autoencoder (VAE) is then trained on these residual vectors rather than on the raw sensor data to learn the distribution of normal prediction errors and quantify their probabilistic uncertainty. The final anomaly score integrates the VAE’s reconstruction error with its Kullback–Leibler (KL) divergence, yielding a statistically grounded measure that jointly reflects the magnitude and distributional abnormality of the residual. The proposed model is evaluated on a real-world multivariate indoor environmental dataset and compared against eight traditional machine learning and deep learning baselines using a synthetic ground truth generated by a 99th percentile-based criterion. HIPER-CHAD achieves an F1-score of 0.8571, outperforming the next best model, the LSTM Autoencoder (F1 = 0.8095), while maintaining perfect recall. Furthermore, a time-step sensitivity analysis demonstrates that a 20-step window yields an optimal F1-score of 0.884, indicating that the proposed residual-based hybrid design provides a reliable and accurate framework for anomaly detection in complex multivariate time-series data.
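The scoring step described in the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes a diagonal-Gaussian VAE posterior with a standard-normal prior (the usual closed-form KL term), squared error as the reconstruction error, and a hypothetical weight `beta` for combining the two terms, since the abstract does not say how they are integrated. The LSTM forecast and the VAE outputs (`r_hat`, `mu`, `log_var`) are taken as given inputs.

```python
import math


def residual(actual, forecast):
    """Prediction-error vector: observed readings minus the LSTM forecast."""
    return [a - f for a, f in zip(actual, forecast)]


def reconstruction_error(r, r_hat):
    """Squared L2 distance between a residual and its VAE reconstruction."""
    return sum((x - y) ** 2 for x, y in zip(r, r_hat))


def kl_to_standard_normal(mu, log_var):
    """Closed-form KL( N(mu, diag(exp(log_var))) || N(0, I) )."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, log_var))


def anomaly_score(r, r_hat, mu, log_var, beta=1.0):
    # beta is an assumed weighting; the abstract only says the two terms
    # are integrated, not how.
    return reconstruction_error(r, r_hat) + beta * kl_to_standard_normal(mu, log_var)


# Toy values (invented for illustration): two sensor channels at one time step.
r = residual([21.5, 400.0], [21.0, 398.0])  # residual is [0.5, 2.0]
score = anomaly_score(r, r_hat=[0.5, 1.0], mu=[1.0, 0.0], log_var=[0.0, 0.0])
print(score)  # 1.5
```

The reconstruction term grows when the residual is large or unusual in shape, while the KL term grows when the encoder places the residual far from the region learned for normal prediction errors, which matches the abstract's description of a score reflecting both magnitude and distributional abnormality.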

## Full-text entities

- **Chemicals:** Carbon Dioxide (MESH:D002245), VOC (MESH:D055549)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12788105/full.md

## Figures

8 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12788105/full.md

## References

38 references — full list in the complete paper: https://tomesphere.com/paper/PMC12788105/full.md

---
Source: https://tomesphere.com/paper/PMC12788105