Creating a Lens of Chinese Culture: A Multimodal Dataset for Chinese Pun Rebus Art Understanding

Tuo Zhang; Tiantian Feng; Yibin Ni; Mengqin Cao; Ruying Liu; Katharine Butler; Yanjun Weng; Mi Zhang; Shrikanth S. Narayanan; Salman Avestimehr

arXiv:2406.10318·cs.CV·June 18, 2024

TL;DR

This paper introduces the Pun Rebus Art Dataset, a multimodal Chinese cultural art dataset, to evaluate and improve vision-language models' understanding of traditional Chinese rebus art and its symbolic meanings.

Contribution

The paper presents a new culturally rich dataset for Chinese rebus art understanding and highlights the limitations of current VLMs in interpreting such art forms.

Findings

01 State-of-the-art VLMs struggle with Chinese rebus tasks.

02 Existing models often produce biased and hallucinated explanations.

03 In-context learning yields only limited improvement.

Abstract

Large vision-language models (VLMs) have demonstrated remarkable abilities in understanding everyday content. However, their performance in the domain of art, particularly culturally rich art forms, remains less explored. As a pearl of human wisdom and creativity, art encapsulates complex cultural narratives and symbolism. In this paper, we offer the Pun Rebus Art Dataset, a multimodal dataset for art understanding deeply rooted in traditional Chinese culture. We focus on three primary tasks: identifying salient visual elements, matching elements with their symbolic meanings, and explaining the conveyed messages. Our evaluation reveals that state-of-the-art VLMs struggle with these tasks, often providing biased and hallucinated explanations and showing limited improvement through in-context learning. By releasing the Pun Rebus Art Dataset, we aim to facilitate the development of…
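
The first two tasks named in the abstract (identifying salient elements, then matching them to symbolic meanings) can be illustrated with a minimal Python sketch. Everything here is an assumption for illustration: the `RebusExample` record layout, its field names, and the two scoring functions are hypothetical and are not taken from the paper or its repository. The sample pairs use two well-known rebus puns: bat (蝠, fú) for good fortune (福, fú) and fish (鱼, yú) for surplus (余, yú).

```python
from dataclasses import dataclass


@dataclass
class RebusExample:
    """Hypothetical record layout; the released dataset's schema may differ."""
    image_path: str            # placeholder path to the artwork image
    elements: list[str]        # task 1: salient visual elements
    meanings: dict[str, str]   # task 2: element -> symbolic meaning
    explanation: str           # task 3: free-text message of the artwork


def element_f1(predicted: list[str], gold: list[str]) -> float:
    """Set-based F1 between predicted and gold elements (illustrative metric)."""
    pred, ref = set(predicted), set(gold)
    overlap = len(pred & ref)
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def matching_accuracy(predicted: dict[str, str], gold: dict[str, str]) -> float:
    """Fraction of gold elements whose predicted meaning matches exactly."""
    if not gold:
        return 0.0
    correct = sum(1 for elem, meaning in gold.items()
                  if predicted.get(elem) == meaning)
    return correct / len(gold)


example = RebusExample(
    image_path="example.jpg",  # hypothetical file name
    elements=["bat", "fish"],
    meanings={"bat": "good fortune", "fish": "surplus"},
    explanation="A wish for good fortune and abundance.",
)

# A hypothetical model output that gets one element and one meaning right.
f1 = element_f1(["bat", "peach"], example.elements)
acc = matching_accuracy({"bat": "good fortune", "fish": "freedom"},
                        example.meanings)
print(f1, acc)
```

Exact string matching is the simplest possible choice; the third task (free-text explanation) is left unscored here because it would need human or model-based judgment.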

Code & Models

Repositories

zhang-tuo-pdf/Pun-Rebus-Art-Benchmark (Official)

Videos

Creating a Lens of Chinese Culture: A Multimodal Dataset for Chinese Pun Rebus Art Understanding · Underline

Taxonomy

Topics: Cultural Heritage Management and Preservation

Methods: Focus