FE-Adapter: Adapting Image-based Emotion Classifiers to Videos
Shreyank N Gowda, Boyan Gao, David A. Clifton

TL;DR
This paper introduces FE-Adapter, a parameter-efficient method for adapting image-based emotion classifiers to videos. It uses about 15 times fewer parameters than previous methods while maintaining or improving accuracy in video emotion recognition.
Contribution
The study proposes a novel cross-modality transfer learning approach, parameter-efficient image-to-video transfer learning, together with the FE-Adapter that implements it. The adapter lets pre-trained image models handle video tasks, matching or exceeding previous methods in accuracy while using far fewer parameters.
Findings
FE-Adapter uses about 15 times fewer parameters than previous methods.
It achieves comparable or better accuracy in video emotion recognition.
The approach demonstrates the potential of cross-modality transfer learning in video analysis.
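To give a feel for what "parameter-efficient" means in the findings above, the following is a rough arithmetic sketch. It is not taken from the paper: it assumes a generic bottleneck adapter (one down-projection and one up-projection per layer) and a ViT-B-like backbone of roughly 86M parameters, 12 layers, and width 768. The function names and the bottleneck size of 64 are hypothetical.

```python
# Hypothetical illustration: none of these sizes come from the FE-Adapter paper.

def adapter_params(d_model: int, bottleneck: int) -> int:
    """Parameter count of one generic bottleneck adapter (weights plus biases)."""
    down = d_model * bottleneck + bottleneck  # down-projection: d_model -> bottleneck
    up = bottleneck * d_model + d_model       # up-projection: bottleneck -> d_model
    return down + up


def trainable_fraction(backbone_params: int, n_layers: int,
                       d_model: int, bottleneck: int) -> float:
    """Share of all parameters that are trained when the backbone stays frozen."""
    added = n_layers * adapter_params(d_model, bottleneck)
    return added / (backbone_params + added)


if __name__ == "__main__":
    # Assumed ViT-B-like backbone: ~86M parameters, 12 layers, width 768.
    per_layer = adapter_params(768, 64)
    fraction = trainable_fraction(86_000_000, 12, 768, 64)
    print(f"adapter parameters per layer: {per_layer}")
    print(f"trainable share of the model: {fraction:.2%}")
```

Under these assumed sizes, only a small percentage of the model is trained, which is the general reason adapter-style methods are cheaper than full fine-tuning.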
Abstract
Utilizing large pre-trained models for specific tasks has yielded impressive results. However, fully fine-tuning these increasingly large models is becoming prohibitively resource-intensive. This has led to a focus on more parameter-efficient transfer learning, primarily within the same modality. But this approach has limitations, particularly in video understanding where suitable pre-trained models are less common. Addressing this, our study introduces a novel cross-modality transfer learning approach from images to videos, which we call parameter-efficient image-to-video transfer learning. We present the Facial-Emotion Adapter (FE-Adapter), designed for efficient fine-tuning in video tasks. This adapter allows pre-trained image models, which traditionally lack temporal processing capabilities, to analyze dynamic video content efficiently. Notably, it uses about 15 times fewer…
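The abstract excerpt does not describe the adapter's internals, so the following is only a generic sketch of how a small residual module can give per-frame image features some temporal context. It is an assumption for illustration, not the FE-Adapter architecture. The name `temporal_adapter`, the mixing kernel, and the `scale` parameter are all hypothetical.

```python
from typing import List

Frame = List[float]  # one feature vector per video frame, e.g. from a frozen image model


def temporal_adapter(frames: List[Frame], kernel: List[float], scale: float) -> List[Frame]:
    """Generic residual temporal mixing (illustrative only, not the paper's design).

    out[t] = x[t] + scale * sum_k kernel[k] * x[t + k - half]
    Neighbours that fall outside the clip are treated as zeros.
    """
    half = len(kernel) // 2
    n_frames = len(frames)
    out: List[Frame] = []
    for t in range(n_frames):
        mixed = [0.0] * len(frames[t])
        for k, weight in enumerate(kernel):
            src = t + k - half  # index of the neighbouring frame
            if 0 <= src < n_frames:
                for i, value in enumerate(frames[src]):
                    mixed[i] += weight * value
        # Residual connection: the frozen per-frame feature passes through unchanged
        # and the temporal term is added on top.
        out.append([x + scale * m for x, m in zip(frames[t], mixed)])
    return out


if __name__ == "__main__":
    clip = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    # With scale 0.0 the module is an identity, so the image model's
    # original per-frame behaviour is preserved before any training.
    print(temporal_adapter(clip, [0.5, 0.0, 0.5], 0.0))
    print(temporal_adapter(clip, [0.5, 0.0, 0.5], 1.0))
```

Starting such a module as an identity (here, `scale = 0.0`) is a common design choice in adapter-style methods, since the pre-trained model's outputs are unchanged until the small added module is trained.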
Taxonomy
Methods: Adapter · Focus
