Transformer Meets Gated Residual Networks To Enhance Photoplethysmogram Artifact Detection Informed by Mutual Information Neural Estimation
Thanh-Dung Le, Clara Macabiau, Kévin Albert, Symeon Chatzinotas, Philippe Jouvet, Rita Noumeir

TL;DR
This paper explores how Gated Residual Networks improve Transformer models for PPG artifact detection in pediatric intensive care, emphasizing the benefits of unsupervised learning, activation functions, and mutual information analysis.
Contribution
The paper introduces Gated Residual Networks within Transformers, analyzes activation functions for the Gated Linear Unit, uses MINE to evaluate the GRN's contribution, and compares integration methods, improving Transformer performance in limited-data scenarios.
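For orientation, MINE in its standard form estimates mutual information by maximizing the Donsker-Varadhan lower bound over a neural critic. The expression below is that standard bound, not necessarily the exact variant used in this paper; the symbols X, Z, and T_θ are notation assumed here, with X and Z standing for whichever pair of representations is being compared.

```latex
% Donsker-Varadhan lower bound maximized by MINE.
% T_\theta is a neural network (the critic) with parameters \theta.
% The first expectation is over the joint distribution of (X, Z),
% the second over the product of the marginals.
I(X; Z) \;\ge\; \sup_{\theta}\;
  \mathbb{E}_{P_{XZ}}\!\left[ T_\theta(x, z) \right]
  \;-\; \log \mathbb{E}_{P_X \otimes P_Z}\!\left[ e^{T_\theta(x, z)} \right]
```

In practice both expectations are approximated on mini-batches, with samples from the product of marginals obtained by shuffling one of the two variables within the batch.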
Findings
GLU with sigmoid activation achieves 0.98 accuracy
Unsupervised pretraining enhances Transformer performance
GRN as an intermediary layer outperforms integration within Attention
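The gating mechanism behind these findings can be sketched in a few lines. This is a minimal pure-Python illustration, assuming the commonly used GRN formulation (dense layer, ELU, dense layer, GLU, residual add, layer normalization); the paper's actual layer sizes, weights, and training setup are not given here, so every dimension, parameter name, and value below is hypothetical.

```python
import math
import random

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def elu(v):
    return v if v > 0 else math.exp(v) - 1.0

def linear(x, weight, bias):
    """Affine map; weight is a list of output rows, each as long as x."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weight, bias)]

def glu(x, gate_w, gate_b, value_w, value_b, gate_fn=sigmoid):
    """Gated Linear Unit: gate_fn(gate branch) * (value branch), elementwise."""
    gate = [gate_fn(g) for g in linear(x, gate_w, gate_b)]
    value = linear(x, value_w, value_b)
    return [g * v for g, v in zip(gate, value)]

def layer_norm(x, eps=1e-5):
    mean = sum(x) / len(x)
    var = sum((xi - mean) ** 2 for xi in x) / len(x)
    return [(xi - mean) / math.sqrt(var + eps) for xi in x]

def grn(x, p, gate_fn=sigmoid):
    """Assumed GRN: LayerNorm(x + GLU(W_out * ELU(W_in * x + b_in) + b_out))."""
    hidden = [elu(h) for h in linear(x, p["w_in"], p["b_in"])]
    hidden = linear(hidden, p["w_out"], p["b_out"])
    gated = glu(hidden, p["gate_w"], p["gate_b"],
                p["value_w"], p["value_b"], gate_fn)
    return layer_norm([xi + gi for xi, gi in zip(x, gated)])

# Toy setup: random weights for a 4-dimensional feature vector (hypothetical).
rng = random.Random(0)
d = 4

def rand_matrix(rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

params = {
    "w_in": rand_matrix(d, d), "b_in": [0.0] * d,
    "w_out": rand_matrix(d, d), "b_out": [0.0] * d,
    "gate_w": rand_matrix(d, d), "gate_b": [0.0] * d,
    "value_w": rand_matrix(d, d), "value_b": [0.0] * d,
}
x = [0.2, -1.0, 0.5, 1.3]

out_sigmoid = grn(x, params)                   # sigmoid gate
out_tanh = grn(x, params, gate_fn=math.tanh)   # alternative gate activation
```

Passing a different `gate_fn` is how the sketch mirrors a comparison of GLU activation functions: only the gate's nonlinearity changes, while the value branch and the residual path stay the same. When the gate outputs zero, the block reduces to layer normalization of its input, which is the sense in which the GRN can skip its own transformation.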
Abstract
This study examines the effectiveness of various learning methods in improving Transformer models, focusing particularly on the Gated Residual Network Transformer (GRN-Transformer) in the context of pediatric intensive care units (PICU) with limited data availability. Our findings indicate that Transformers trained via supervised learning are less effective than MLP, CNN, and LSTM networks in such environments. Yet, leveraging unsupervised and self-supervised learning on unannotated data, with subsequent fine-tuning on annotated data, notably enhances Transformer performance, although not to the level of the GRN-Transformer. Central to our research is the analysis of different activation functions for the Gated Linear Unit (GLU), a crucial element of the GRN structure. We also employ Mutual Information Neural Estimation (MINE) to evaluate the GRN's contribution. Additionally,…
Taxonomy
Topics: Non-Invasive Vital Sign Monitoring · EEG and Brain-Computer Interfaces · Hemodynamic Monitoring and Therapy
Methods: Attention Is All You Need · Tanh Activation · Long Short-Term Memory · Softmax · Layer Normalization · Sigmoid Activation · Linear Layer · Byte Pair Encoding · Label Smoothing · Adam
