The Focus-Aspect-Polarity Model for Predicting Subjective Noun Attributes in Images

Tushar Karayil; Philipp Blandfort; Jörn Hees; Andreas Dengel

arXiv:1810.06219·cs.CV·October 16, 2018

Open Access

TL;DR

This paper introduces the Focus-Aspect-Polarity model for structuring how subjective attributes are captured in images, along with a new dataset that follows this model. Experiments on the dataset show that fusing context information via tensor multiplication outperforms concatenation in several cases.

Contribution

The paper proposes the Focus-Aspect-Polarity model, a structured approach to subjective image interpretation, and provides a new dataset with fine-grained subjective labels built on this model.

Findings

01

Tensor multiplication-based context integration outperforms concatenation in several cases.

02

The new dataset enables fine-grained subjective attribute labeling.

03

Deep learning methods benefit from structured context modeling.

Abstract

Subjective visual interpretation is a challenging yet important topic in computer vision. Many approaches reduce this problem to the prediction of adjective- or attribute-labels from images. However, most of these do not take attribute semantics into account, or only process the image in a holistic manner. Furthermore, there is a lack of relevant datasets with fine-grained subjective labels. In this paper, we propose the Focus-Aspect-Polarity model to structure the process of capturing subjectivity in image processing, and introduce a novel dataset following this way of modeling. We run experiments on this dataset to compare several deep learning methods and find that incorporating context information based on tensor multiplication in several cases outperforms the default way of information fusion (concatenation).
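The contrast between the two fusion strategies named in the abstract can be illustrated with a small sketch. The abstract does not specify the exact tensor operation, so the outer product used here is an assumption, chosen as one simple form of multiplicative fusion; the function names, feature vectors, and dimensions are hypothetical and are not taken from the paper.

```python
from typing import List


def fuse_concat(image_feat: List[float], context_feat: List[float]) -> List[float]:
    """Concatenation fusion: stack the two feature vectors end to end.

    The result has len(image_feat) + len(context_feat) entries, and no
    entry depends on both inputs at once.
    """
    return list(image_feat) + list(context_feat)


def fuse_outer_product(image_feat: List[float], context_feat: List[float]) -> List[float]:
    """Multiplicative fusion (assumed form): flattened outer product.

    Every image feature is multiplied by every context feature, so the
    result has len(image_feat) * len(context_feat) entries, each of which
    encodes a pairwise interaction between the two inputs.
    """
    return [x * c for x in image_feat for c in context_feat]


# Hypothetical toy features: a 3-d image embedding and a 2-d context embedding.
image_feat = [1.0, 2.0, 3.0]
context_feat = [0.5, -1.0]

concat = fuse_concat(image_feat, context_feat)
outer = fuse_outer_product(image_feat, context_feat)

print(len(concat), concat)
print(len(outer), outer)
```

In this sketch, concatenation leaves any interaction between image and context to be learned by later layers, whereas the multiplicative variant makes pairwise interactions explicit at the cost of a fused vector whose size grows with the product of the input dimensions.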

Taxonomy

Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Advanced Image and Video Retrieval Techniques