Multimodal Semantic Transfer from Text to Image. Fine-Grained Image Classification by Distributional Semantics

Simon Donig; Maria Christoforaki; Bernhard Bermeitinger; Siegfried Handschuh

arXiv:2001.02372·cs.CV·January 9, 2020·1 cites

Open Access

TL;DR

This paper introduces a multimodal approach that transfers semantic information from text to image classification, using distributional semantics to enhance fine-grained image categorization in digital humanities.

Contribution

It presents a novel method combining CNNs with distributional semantic vectors derived from domain-specific texts for improved image classification.

Findings

01 Enhanced semantic understanding in image classification

02 Better handling of small and high-dimensional data

03 Improved accuracy in fine-grained categories

Abstract

In recent years, image classification methods such as neural networks have become widespread in art history and Heritage Informatics (Lang and Ommer 2018). These methods face several challenges in the Digital Humanities, including comparatively small amounts of data and high-dimensional data. Here, a Convolutional Neural Network (CNN) is used whose output is not, as usual, a series of flat text labels but a series of semantically loaded vectors. These vectors result from a Distributional Semantic Model (DSM) which is generated from an in-domain text corpus.
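To illustrate what it can mean for a classifier to output a semantic vector rather than a flat label, the following minimal sketch maps a predicted vector back to the closest class name. It is an illustration under stated assumptions, not the authors' implementation: the three-dimensional `LABEL_VECTORS` are hypothetical toy values standing in for DSM embeddings, the class names are invented, and decoding by cosine similarity is one common choice that the abstract does not confirm.

```python
import math

# Hypothetical toy vectors standing in for DSM embeddings of class names.
# A real DSM trained on an in-domain corpus would have far more dimensions.
LABEL_VECTORS = {
    "chair": [0.9, 0.1, 0.0],
    "armchair": [0.8, 0.3, 0.1],
    "table": [0.1, 0.9, 0.2],
}


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_labels(predicted_vector, label_vectors):
    """Sort class names from most to least similar to the predicted vector."""
    return sorted(
        label_vectors,
        key=lambda label: cosine_similarity(predicted_vector, label_vectors[label]),
        reverse=True,
    )


def decode_label(predicted_vector, label_vectors):
    """Return the class name whose vector is closest to the prediction."""
    return rank_labels(predicted_vector, label_vectors)[0]


# Assumed stand-in for the vector a CNN would emit for one image.
predicted = [0.85, 0.15, 0.05]
print(decode_label(predicted, LABEL_VECTORS))
print(rank_labels(predicted, LABEL_VECTORS))
```

Because the targets live in a shared vector space, a wrong prediction can still land near a related class (here "armchair" ranks just below "chair"), which a flat label cannot express.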

Taxonomy

Topics: Image Retrieval and Classification Techniques · Topic Modeling · Music and Audio Processing