How to Read Paintings: Semantic Art Understanding with Multi-Modal Retrieval

Noa Garcia; George Vogiatzis

arXiv:1810.09617·cs.CV·October 24, 2018

TL;DR

This paper introduces SemArt, a multi-modal dataset for semantic art understanding, and the Text2Art retrieval challenge, in which paintings are retrieved from textual descriptions and vice versa. The best model ranks the correct image within the top 10 in 45.5% of cases.

Contribution

The paper presents SemArt, a novel dataset and multi-modal retrieval task for semantic art understanding, along with models that encode visual and textual art representations into a shared semantic space.

Findings

01

Best model retrieves correct images within top 10 in 45.5% of cases

02

Models outperform baseline in multi-modal retrieval tasks

03

High correlation with human art understanding evaluation
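
The "within top 10" figure above is a recall-at-K style measurement. The following is a minimal sketch of how such a score can be computed once texts and images are encoded in a common space. It assumes cosine similarity as the ranking function, which this summary does not confirm; the function names and the toy embeddings are hypothetical and only for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def rank_of_match(query, gallery, match_index):
    """1-based rank of gallery[match_index] when sorted by similarity to query."""
    sims = [cosine(query, g) for g in gallery]
    target = sims[match_index]
    # Count gallery items that score strictly higher than the true match.
    return 1 + sum(1 for s in sims if s > target)

def recall_at_k(text_embs, image_embs, k):
    """Fraction of texts whose paired image (same index) ranks in the top k."""
    hits = sum(
        1 for i, t in enumerate(text_embs)
        if rank_of_match(t, image_embs, i) <= k
    )
    return hits / len(text_embs)

# Toy embeddings (assumed, not from the paper): text i is paired with image i.
text_embs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
image_embs = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.7]]

print(recall_at_k(text_embs, image_embs, 1))
print(recall_at_k(text_embs, image_embs, 2))
```

Swapping the two arguments of `recall_at_k` gives the reverse direction (image to text), matching the "and vice versa" part of the task.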

Abstract

Automatic art analysis has been mostly focused on classifying artworks into different artistic styles. However, understanding an artistic representation involves more complex processes, such as identifying the elements in the scene or recognizing author influences. We present SemArt, a multi-modal dataset for semantic art understanding. SemArt is a collection of fine-art painting images in which each image is associated with a number of attributes and a textual artistic comment, such as those that appear in art catalogues or museum collections. To evaluate semantic art understanding, we envisage the Text2Art challenge, a multi-modal retrieval task where relevant paintings are retrieved according to an artistic text, and vice versa. We also propose several models for encoding visual and textual artistic representations into a common semantic space. Our best approach is able to find the…
