Towards Visual Semantics

Fausto Giunchiglia; Luca Erculiani; Andrea Passerini

arXiv:2104.12379 · cs.AI · September 15, 2021

TL;DR

This paper proposes a theory and an algorithm for mapping concepts built from visual perception (substance concepts) to the concepts encoded by words (classification concepts), a step towards integrating visual data with natural language descriptions. Initial results are reported as promising.

Contribution

The paper introduces a framework for building substance concepts that align with classification concepts, using visual objects, a hierarchy, and human feedback.

Findings

1. The algorithm learns Genus and Differentia with reasonable accuracy.
2. It is effective with limited examples and partial supervision.
3. Preliminary experiments show promising results.

Abstract

Lexical Semantics is concerned with how words encode mental representations of the world, i.e., concepts. We call concepts of this type classification concepts. In this paper, we focus on Visual Semantics, namely on how humans build concepts representing what they perceive visually. We call concepts of this second type substance concepts. As shown in the paper, these two types of concepts are different and, furthermore, the mapping between them is many-to-many. In this paper we provide a theory and an algorithm for building substance concepts that are in one-to-one correspondence with classification concepts, thus paving the way to the seamless integration of natural language descriptions and visual perception. This work builds upon three main intuitions: (i) substance concepts are modeled as visual objects, namely sequences of similar frames, as perceived in…
