Aligning Visual and Lexical Semantics

Fausto Giunchiglia; Mayukh Bagchi; Xiaolei Diao

arXiv:2212.06629 · cs.CV · December 14, 2022

TL;DR

This paper examines the mismatch between visual and lexical semantics in computer vision and proposes a general, domain-agnostic methodology for aligning the two, with the aim of addressing the Semantic Gap Problem (SGP).

Contribution

The paper introduces a general, domain-agnostic methodology to enforce alignment between visual and lexical semantics, aiming to address the Semantic Gap Problem in computer vision systems.

Findings

01 Exemplifies extensively the lack of coincidence between visual and lexical semantics.

02 Proposes a general, domain-agnostic methodology to enforce alignment between the two.

03 Ties this lack of coincidence to the Semantic Gap Problem (SGP) that affects CV systems.

Abstract

We discuss two kinds of semantics relevant to Computer Vision (CV) systems: Visual Semantics and Lexical Semantics. While visual semantics focus on how humans build concepts when using vision to perceive a target reality, lexical semantics focus on how humans build concepts of the same target reality through the use of language. The lack of coincidence between visual and lexical semantics, in turn, has a major impact on CV systems in the form of the Semantic Gap Problem (SGP). The paper, while extensively exemplifying this lack of coincidence, introduces a general, domain-agnostic methodology to enforce alignment between visual and lexical semantics.

Taxonomy

Topics: Categorization, perception, and language · Constraint Satisfaction and Optimization · Visual Attention and Saliency Detection