Seeing the Undefined: Chain-of-Action for Generative Semantic Labels

Meng Wei; Zhongnian Li; Peng Ying; Xinzheng Xu

arXiv:2411.17406 · cs.CV · September 16, 2025

TL;DR

This paper introduces Generative Semantic Labels (GSLs), a new task for generating comprehensive, undefined semantic labels for images, and proposes Chain-of-Action (CoA), a method that improves label generation by sequentially enriching contextual information.

Contribution

The paper presents GSLs as a novel task and introduces CoA, a new method that decomposes label generation into sequential actions to enhance contextual understanding and accuracy.
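The idea of decomposing label generation into sequential actions can be illustrated with a minimal sketch. This is not the authors' implementation: the `Action` interface, the `run_chain` helper, and the three stub actions are hypothetical names invented here, and the hard-coded strings stand in for calls to a vision-language model. It only shows the general pattern in which each action sees the context produced by the actions before it.

```python
from typing import Callable, List

# Hypothetical interface (not from the paper): an action receives an image
# reference plus the context gathered so far and returns one new piece of text.
Action = Callable[[str, List[str]], str]


def run_chain(image: str, actions: List[Action]) -> List[str]:
    """Run actions in order; each one sees the outputs of all earlier ones."""
    context: List[str] = []
    for action in actions:
        context.append(action(image, context))
    return context


# Stub actions standing in for VLM calls (placeholder outputs, illustration only).
def describe_scene(image: str, context: List[str]) -> str:
    return "scene: a dog on a beach"


def list_objects(image: str, context: List[str]) -> str:
    # Uses the previous action's output as extra context.
    return f"objects (given '{context[-1]}'): dog, sand, sea"


def generate_labels(image: str, context: List[str]) -> str:
    # Final action: emit labels conditioned on everything gathered so far.
    return "labels: dog, beach, sand, sea"


steps = run_chain("photo.jpg", [describe_scene, list_objects, generate_labels])
labels = steps[-1].split(": ", 1)[1].split(", ")
```

In a real system each stub would be replaced by a model query whose prompt includes the accumulated `context`; the official repository listed under Code & Models is the reference for how the paper's method actually does this.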

Findings

1. CoA significantly improves semantic label accuracy.
2. GSLs enables richer image content representation.
3. Method outperforms existing approaches on benchmark datasets.

Abstract

Recent advances in vision-language models (VLMs) have demonstrated remarkable capabilities in image classification by leveraging predefined sets of labels to construct text prompts for zero-shot reasoning. However, these approaches face significant limitations in undefined domains, where the label space is vocabulary-unknown and composite. We thus introduce Generative Semantic Labels (GSLs), a novel task that aims to predict a comprehensive set of semantic labels for an image without being constrained by a predefined label set. Unlike traditional zero-shot classification, GSLs generates multiple semantic-level labels, encompassing objects, scenes, attributes, and relationships, thereby providing a richer and more accurate representation of image content. In this paper, we propose Chain-of-Action (CoA), an innovative method designed to tackle the GSLs task. CoA is motivated by the…

Code & Models

Repositories

WilsonMqz/CoA (PyTorch, official)

Taxonomy

Topics: Semantic Web and Ontologies

Methods: Sparse Evolutionary Training