Is Your Image a Good Storyteller?
Xiujie Song, Xiaoyi Pang, Haifeng Tang, Mengyue Wu, Kenny Q. Zhu

TL;DR
This paper introduces the first dataset and method for automatically assessing the semantic richness of images, which is crucial for storytelling, cognitive assessment, and AI development.
Contribution
It presents the novel ISA dataset and a language-based approach to evaluating the semantic complexity of images, an aspect of image complexity that has been largely overlooked.
Findings
The proposed method effectively predicts the semantic richness of images.
Semantic assessment correlates with storytelling and cognitive utility.
The dataset enables further research in image semantics.
Abstract
Quantifying image complexity at the entity level is straightforward, but the assessment of semantic complexity has been largely overlooked. In fact, there are differences in semantic complexity across images. Images with richer semantics can tell vivid and engaging stories and offer a wide range of application scenarios. For example, the Cookie Theft picture is such a kind of image and is widely used to assess human language and cognitive abilities due to its higher semantic complexity. Additionally, semantically rich images can benefit the development of vision models, as images with limited semantics are becoming less challenging for them. However, such images are scarce, highlighting the need for a greater number of them. For instance, there is a need for more images like Cookie Theft to cater to people from different cultural backgrounds and eras. Assessing semantic complexity…
Taxonomy
Topics: Digital Imaging in Medicine
