Multimodal Metadata Assignment for Cultural Heritage Artifacts

Luis Rei; Dunja Mladenić; Mareike Dorozynski; Franz Rottensteiner; Thomas Schleider; Raphaël Troncy; Jorge Sebastián Lozano; Mar Gaitán Salvatella

arXiv:2406.00423 · cs.CV · June 4, 2024

TL;DR

This paper presents a multimodal classification system for cultural heritage artifacts that combines image, text, and tabular data using deep learning and a knowledge graph to predict missing properties of digitized silk artifacts.

Contribution

The paper introduces a multimodal classifier based on late fusion and a new dataset of cultural heritage artifacts built from a knowledge graph.
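
As a rough illustration of what late fusion means here, the sketch below concatenates the class probabilities produced by separate image and text classifiers with encoded tabular features into a single row. Such a row could then be passed to a gradient tree boosting model, which the abstract names as the fusion method. The function name `fuse_late`, the three-class setup, and the one-hot tabular encoding are assumptions made for the example, not details taken from the paper.

```python
from typing import List, Sequence


def fuse_late(image_probs: Sequence[float],
              text_probs: Sequence[float],
              tabular: Sequence[float]) -> List[float]:
    """Build one fused feature row from per-modality outputs.

    Each unimodal classifier is run independently first; only its
    output probabilities (not its internal features) are combined.
    """
    return [*image_probs, *text_probs, *tabular]


# Hypothetical record: 3 candidate classes, one tabular field
# one-hot encoded over 2 possible values.
row = fuse_late([0.7, 0.2, 0.1], [0.5, 0.4, 0.1], [1.0, 0.0])
print(len(row))
```

Because fusion happens on classifier outputs, each modality can be trained and replaced on its own, and the fusion model only needs to learn how much to trust each one.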

Findings

01 Multimodal approach outperforms individual classifiers.

02 High accuracy in predicting missing artifact properties.

03 Effective integration of deep learning and knowledge graphs.

Abstract

We develop a multimodal classifier for the cultural heritage domain using a late fusion approach and introduce a novel dataset. The three modalities are Image, Text, and Tabular data. We based the image classifier on a ResNet convolutional neural network architecture and the text classifier on a multilingual transformer architecture (XLM-RoBERTa). Both are trained as multitask classifiers and use the focal loss to handle class imbalance. Tabular data and late fusion are handled by Gradient Tree Boosting. We also show how we leveraged specific data models and taxonomy in a Knowledge Graph to create the dataset and to store classification results. All individual classifiers accurately predict missing properties in the digitized silk artifacts, with the multimodal approach providing the best results.
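
The focal loss mentioned in the abstract can be sketched for a single example as FL(p_t) = -alpha * (1 - p_t)^gamma * log(p_t), where p_t is the predicted probability of the true class. The minimal pure-Python version below is only an illustration: the values `gamma=2.0` and `alpha=1.0` are commonly used defaults assumed here, not settings reported in this abstract, and the function names are hypothetical.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over one example's logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def focal_loss(logits: Sequence[float], target: int,
               gamma: float = 2.0, alpha: float = 1.0) -> float:
    """Focal loss for one example; gamma and alpha are assumed values."""
    p_t = softmax(logits)[target]
    # (1 - p_t)^gamma shrinks the loss of well-classified examples,
    # so rare, hard classes contribute relatively more to training.
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)


def cross_entropy(logits: Sequence[float], target: int) -> float:
    """Standard cross-entropy is the gamma = 0 special case."""
    return focal_loss(logits, target, gamma=0.0)


easy = [4.0, 0.0, 0.0]   # confident, correct prediction for class 0
hard = [0.0, 0.0, 0.0]   # uniform prediction
print(focal_loss(easy, 0) / cross_entropy(easy, 0))
print(focal_loss(hard, 0) / cross_entropy(hard, 0))
```

The ratio of focal loss to cross-entropy equals (1 - p_t)^gamma, so it is much smaller for the confident example than for the uncertain one, which is how the loss shifts attention away from abundant, easy classes.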

Code & Models

Repositories

silknow/multimodal_cultural_heritage (PyTorch, official)

Taxonomy

Methods: Convolution · Kaiming Initialization · Max Pooling · Average Pooling · Global Average Pooling · Focal Loss