Visual Interest Prediction with Attentive Multi-Task Transfer Learning
Deepanway Ghosal, Maheshkumar H. Kolekar

TL;DR
This paper introduces a neural network model that uses transfer learning and attention mechanisms within a multi-task framework to accurately predict visual interest and affective responses in digital photos, outperforming existing methods.
Contribution
It presents a novel multi-task transfer learning approach with attention mechanisms for predicting visual interest and affect, achieving significant improvements over prior state-of-the-art models.
Findings
Large performance improvement over existing systems
Effective multi-task learning for affect prediction
Validation on benchmark dataset confirms model's superiority
Abstract
Visual interest and affect prediction is an interesting research area in computer vision. In this paper, we propose a neural network model based on transfer learning and an attention mechanism to predict visual interest and affective dimensions in digital photos. The multi-dimensional affects are learned jointly through a multi-task learning framework. Experiments demonstrate the effectiveness of the proposed approach, and evaluation on the benchmark dataset shows a large improvement over current state-of-the-art systems.
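The abstract does not give architectural details, so the following is only an illustrative sketch of how the three ingredients it names (transferred features, attention, and multi-task prediction) could fit together. It is not the authors' model. Everything here is an assumption: the task names (`interest`, `valence`, `arousal`), the feature shape, the single-vector attention scorer, the tanh shared layer, and the summed mean-squared-error loss. The region features stand in for the output of a frozen pretrained CNN and are replaced by random numbers.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def init_params(feat_dim, hidden_dim, task_names):
    # Hypothetical parameter layout: one attention vector, one shared
    # layer, and one linear regression head per task.
    return {
        "w_att": rng.normal(0, 0.1, size=(feat_dim,)),
        "W_shared": rng.normal(0, 0.1, size=(feat_dim, hidden_dim)),
        "b_shared": np.zeros(hidden_dim),
        "heads": {t: (rng.normal(0, 0.1, size=(hidden_dim,)), 0.0)
                  for t in task_names},
    }

def forward(params, region_feats):
    # region_feats: (batch, regions, feat_dim), assumed to come from a
    # frozen pretrained CNN (the transfer-learning part).
    scores = region_feats @ params["w_att"]            # (batch, regions)
    weights = softmax(scores, axis=1)                  # attention over regions
    pooled = (weights[..., None] * region_feats).sum(axis=1)  # (batch, feat_dim)
    hidden = np.tanh(pooled @ params["W_shared"] + params["b_shared"])
    # Every task head reads the same shared representation.
    preds = {t: hidden @ w + b for t, (w, b) in params["heads"].items()}
    return preds, weights

def multitask_loss(preds, targets):
    # Unweighted sum of per-task mean squared errors (an assumed choice).
    return sum(np.mean((preds[t] - targets[t]) ** 2) for t in preds)

# Usage with random stand-in data.
tasks = ["interest", "valence", "arousal"]  # hypothetical task names
params = init_params(feat_dim=512, hidden_dim=64, task_names=tasks)
feats = rng.normal(size=(4, 49, 512))       # 4 images, 7x7 regions each
preds, att = forward(params, feats)
targets = {t: rng.uniform(size=4) for t in tasks}
loss = multitask_loss(preds, targets)
```

Sharing the attention-pooled representation across heads is what makes this multi-task: gradients from every affective dimension would update the same shared weights during training, which is omitted here for brevity.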
Taxonomy
Topics: Visual Attention and Saliency Detection · Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques
