Caption Generation for Dongba Paintings via Prompt Learning and Semantic Fusion
Shuangwu Qian, Xiaochan Yuan, Pengfei Liu

TL;DR
This paper introduces PVGF-DPC, a framework that generates culturally accurate captions for Dongba paintings by combining semantic prompts with a fusion loss, addressing the domain shift that arises when mainstream captioning models are applied to this imagery.
Contribution
It proposes a new encoder-decoder model with prompt learning and a fusion loss to improve captioning of culturally specific Dongba paintings, supported by a dedicated dataset.
Findings
Enhanced caption accuracy for Dongba paintings.
Effective integration of cultural cues into caption generation.
Improved semantic alignment between images and descriptions.
Abstract
Dongba paintings, the treasured pictorial legacy of the Naxi people in southwestern China, feature richly layered visual elements, vivid color palettes, and pronounced ethnic and regional cultural symbolism, yet their automatic textual description remains largely unexplored owing to severe domain shift when mainstream captioning models are applied directly. This paper proposes PVGF-DPC (Prompt and Visual Semantic-Generation Fusion-based Dongba Painting Captioning), an encoder-decoder framework that integrates a content prompt module with a novel visual semantic-generation fusion loss to bridge the gap between generic natural-image captioning and the culturally specific imagery found in Dongba art. A MobileNetV2 encoder extracts discriminative visual features, which are injected into the layer normalization of a 10-layer Transformer decoder initialized with pretrained…
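The abstract says that visual features are "injected into the layer normalization" of the decoder but, as excerpted here, does not say how. One common way to do this is conditional layer normalization, where the image feature is projected into offsets that are added to the gain and bias of the normalization. The sketch below illustrates that idea in plain Python. It is an assumption about the mechanism, not the authors' implementation, and all names in it (`visual_conditioned_layer_norm`, `w_gain`, `w_bias`) are hypothetical.

```python
import math


def layer_norm(x, gain, bias, eps=1e-5):
    """Standard layer normalization over one token's hidden vector."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    inv_std = 1.0 / math.sqrt(var + eps)
    return [g * (v - mean) * inv_std + b for v, g, b in zip(x, gain, bias)]


def linear(weights, vec):
    """Multiply a (rows x cols) matrix, stored as nested lists, by a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


def visual_conditioned_layer_norm(x, visual_feat, w_gain, w_bias,
                                  base_gain, base_bias):
    """Layer norm whose gain and bias are shifted by a visual feature.

    Assumption (not from the paper): the pooled image feature is linearly
    projected to per-channel offsets that are added to the base gain and bias.
    """
    delta_gain = linear(w_gain, visual_feat)
    delta_bias = linear(w_bias, visual_feat)
    gain = [g + d for g, d in zip(base_gain, delta_gain)]
    bias = [b + d for b, d in zip(base_bias, delta_bias)]
    return layer_norm(x, gain, bias)


# Toy sizes: hidden width 4, visual feature width 3.
hidden = [1.0, 2.0, 3.0, 4.0]
visual = [1.0, 1.0, 1.0]
ones, zeros = [1.0] * 4, [0.0] * 4
zero_proj = [[0.0] * 3 for _ in range(4)]
bias_proj = [[0.1] * 3 for _ in range(4)]

plain = layer_norm(hidden, ones, zeros)
shifted = visual_conditioned_layer_norm(hidden, visual, zero_proj, bias_proj,
                                        ones, zeros)
```

With zero-initialized projections the conditioned layer reduces to ordinary layer normalization, so a pretrained decoder would start from its original behavior and learn the visual modulation gradually. Whether PVGF-DPC initializes its projections this way is not stated in the excerpt.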
Taxonomy
Topics: Multimodal Machine Learning Applications · Generative Adversarial Networks and Image Synthesis · Aesthetic Perception and Analysis
