Transformer with Leveraged Masked Autoencoder for video-based Pain Assessment
Minh-Duc Nguyen, Hyung-Jeong Yang, Soo-Hyung Kim, Ji-Eun Shin, and Seung-Won Kim

TL;DR
This paper introduces a Transformer-based model combined with a Masked Autoencoder to improve facial video analysis for pain assessment, aiming to provide an objective and effective healthcare tool.
Contribution
It presents a novel integration of Masked Autoencoder and Transformer architecture specifically for pain recognition from facial videos.
Findings
Achieved promising results on the AI4Pain dataset.
Enhanced pain level detection accuracy.
Demonstrated effectiveness of combined Autoencoder and Transformer approach.
Abstract
Accurate pain assessment is crucial in healthcare for effective diagnosis and treatment; however, traditional methods relying on self-reporting are inadequate for populations unable to communicate their pain. Cutting-edge AI is promising for supporting clinicians in pain recognition using facial video data. In this paper, we enhance pain recognition by employing facial video analysis within a Transformer-based deep learning model. By combining a powerful Masked Autoencoder with a Transformer-based classifier, our model effectively captures pain level indicators through both expressions and micro-expressions. We conducted our experiment on the AI4Pain dataset, which produced promising results that pave the way for innovative healthcare solutions that are both comprehensive and objective.
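The abstract does not describe the architecture in detail. As a rough illustration only, the random patch masking that Masked Autoencoder pretraining generally relies on might look like the sketch below. The function name `random_mask`, the 75% mask ratio, and the 14 x 14 patch grid are all assumptions chosen for illustration, not values taken from the paper.

```python
import random


def random_mask(num_patches, mask_ratio=0.75, seed=None):
    """Split patch indices into (visible, masked) lists.

    Hypothetical helper: in MAE-style pretraining, only the visible
    patches are fed to the encoder, and a decoder is trained to
    reconstruct the masked ones. The default ratio is an assumption.
    """
    rng = random.Random(seed)
    indices = list(range(num_patches))
    rng.shuffle(indices)  # random permutation of patch positions
    num_masked = int(num_patches * mask_ratio)
    masked = sorted(indices[:num_masked])
    visible = sorted(indices[num_masked:])
    return visible, masked


# Example: one face frame split into an assumed 14 x 14 grid (196 patches).
visible, masked = random_mask(196, mask_ratio=0.75, seed=0)
print(len(visible), len(masked))  # 49 147
```

In a pipeline of the kind the abstract outlines, the features produced by the pretrained encoder would then presumably be passed to the Transformer-based classifier to predict the pain level; that stage is not sketched here because the abstract gives no specifics.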
Taxonomy
Topics: Infrared Thermography in Medicine · Medical Imaging and Analysis · Brain Tumor Detection and Classification
