Transformer Meets Gated Residual Networks To Enhance Photoplethysmogram Artifact Detection Informed by Mutual Information Neural Estimation

Thanh-Dung Le; Clara Macabiau; Kévin Albert; Symeon Chatzinotas; Philippe Jouvet; Rita Noumeir

arXiv:2405.16177 · eess.SP · May 27, 2025 · 1 citation

Open Access

TL;DR

This paper examines how Gated Residual Networks improve Transformer models for PPG artifact detection in pediatric intensive care, highlighting the roles of unsupervised pretraining, the choice of GLU activation function, and mutual information analysis.

Contribution

It introduces Gated Residual Networks within Transformers, analyzes activation functions with MINE, and compares integration methods, improving Transformer performance in limited-data scenarios.

Findings

01

GLU with sigmoid activation achieves 0.98 accuracy

02

Unsupervised pretraining enhances Transformer performance

03

GRN as an intermediary layer outperforms integration within Attention
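To make the first and third findings concrete, here is a minimal sketch of a Gated Linear Unit (GLU) with a swappable gate activation, wrapped in a Gated Residual Network (GRN). It assumes the GRN formulation popularized by the Temporal Fusion Transformer (dense layer, ELU, dense layer, GLU, residual connection, layer normalization); the paper's exact layer sizes and placement inside the Transformer may differ. All function names, the parameter dictionary, and the random initialization are hypothetical and chosen only for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def elu(x):
    return x if x > 0 else math.exp(x) - 1.0


def linear(x, weight, bias):
    """Affine map; weight is a list of rows, one row per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weight, bias)]


def glu(x, w_gate, b_gate, w_value, b_value, gate_activation=sigmoid):
    """GLU: gate_activation(W_g x + b_g) * (W_v x + b_v), elementwise."""
    gate = [gate_activation(g) for g in linear(x, w_gate, b_gate)]
    value = linear(x, w_value, b_value)
    return [g * v for g, v in zip(gate, value)]


def layer_norm(x, eps=1e-5):
    """Normalize to zero mean and (approximately) unit variance."""
    mean = sum(x) / len(x)
    var = sum((xi - mean) ** 2 for xi in x) / len(x)
    return [(xi - mean) / math.sqrt(var + eps) for xi in x]


def grn(x, params, gate_activation=sigmoid):
    """Assumed GRN: LayerNorm(x + GLU(W1 * ELU(W2 x + b2) + b1))."""
    hidden = [elu(h) for h in linear(x, params["w2"], params["b2"])]
    hidden = linear(hidden, params["w1"], params["b1"])
    gated = glu(hidden, params["wg"], params["bg"],
                params["wv"], params["bv"], gate_activation)
    return layer_norm([xi + gi for xi, gi in zip(x, gated)])


def init_params(dim, seed=0):
    """Hypothetical initializer: small random square weights, zero biases."""
    rng = random.Random(seed)

    def mat():
        return [[rng.uniform(-0.5, 0.5) for _ in range(dim)]
                for _ in range(dim)]

    def vec():
        return [0.0] * dim

    return {"w1": mat(), "b1": vec(), "w2": mat(), "b2": vec(),
            "wg": mat(), "bg": vec(), "wv": mat(), "bv": vec()}


# Usage: the same input passed through a sigmoid-gated and a tanh-gated GRN.
params = init_params(4)
x = [0.2, -1.0, 0.5, 1.3]
out_sigmoid = grn(x, params)
out_tanh = grn(x, params, gate_activation=math.tanh)
```

With a sigmoid gate, each gate value lies in (0, 1), so the GLU can smoothly suppress the transformed signal and let the residual path dominate; a tanh gate instead allows sign flips. Comparing such variants is the kind of activation-function study the findings refer to.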

Abstract

This study examines the effectiveness of various learning methods in improving Transformer models, focusing on the Gated Residual Network Transformer (GRN-Transformer) in pediatric intensive care units (PICU), where data availability is limited. Our findings indicate that Transformers trained via supervised learning are less effective than MLP, CNN, and LSTM networks in such environments. Yet leveraging unsupervised and self-supervised learning on unannotated data, with subsequent fine-tuning on annotated data, notably enhances Transformer performance, although not to the level of the GRN-Transformer. Central to our research is the analysis of different activation functions for the Gated Linear Unit (GLU), a crucial element of the GRN structure. We also employ Mutual Information Neural Estimation (MINE) to evaluate the GRN's contribution. Additionally,…
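For readers unfamiliar with MINE, the estimator is usually stated through the Donsker-Varadhan lower bound on mutual information. The expression below is that standard bound, not a formula taken from this paper; the symbols $X$ (layer input), $Z$ (layer output), and the statistics network $T_\theta$ are notation assumed here for illustration.

```latex
I(X; Z) \;\ge\; \sup_{\theta}\;
  \mathbb{E}_{P_{XZ}}\!\left[ T_\theta(x, z) \right]
  \;-\; \log \mathbb{E}_{P_X \otimes P_Z}\!\left[ e^{T_\theta(x, z)} \right]
```

In practice the first expectation is taken over paired samples $(x, z)$ and the second over pairs in which $z$ has been shuffled across the batch, and $\theta$ is trained by gradient ascent on the right-hand side. A higher estimate for a GRN-equipped model than for its baseline would suggest that the gating preserves more information about the input.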

Taxonomy

Topics: Non-Invasive Vital Sign Monitoring · EEG and Brain-Computer Interfaces · Hemodynamic Monitoring and Therapy

Methods: Attention Is All You Need · Tanh Activation · Long Short-Term Memory · Softmax · Layer Normalization · Sigmoid Activation · Linear Layer · Byte Pair Encoding · Label Smoothing · Adam