CoCoG: Controllable Visual Stimuli Generation based on Human Concept Representations
Chen Wei, Jiachen Zou, Dietmar Heinke, Quanying Liu

TL;DR
CoCoG introduces a novel AI framework that extracts human-interpretable concepts from visual stimuli, predicts human similarity judgments, and generates controllable visual objects to enhance understanding of human cognition.
Contribution
The paper presents the first AI model capable of extracting human concept representations, predicting human behavior, and generating visual stimuli with controllable concepts.
Findings
Achieves 64.07% accuracy in predicting human similarity judgments
Generates diverse visual objects controlled by concepts
Manipulates human similarity judgments by intervening on key concepts
Abstract
A central question for cognitive science is to understand how humans process visual objects, i.e., to uncover the low-dimensional human concept representation space from high-dimensional visual stimuli. Generating visual stimuli with controllable concepts is the key. However, there are currently no generative models in AI that solve this problem. Here, we present the Concept based Controllable Generation (CoCoG) framework. CoCoG consists of two components: a simple yet efficient AI agent for extracting interpretable concepts and predicting human decision-making in visual similarity judgment tasks, and a conditional generation model for generating visual stimuli given the concepts. We quantify the performance of CoCoG from two aspects: the human behavior prediction accuracy and the controllable generation ability. The experiments with CoCoG indicate that 1) the reliable concept embeddings in…
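The abstract does not say how similarity judgments are derived from concept embeddings, so the following is only a minimal sketch under stated assumptions: the task is treated as a triplet odd-one-out choice, similarity is a dot product between concept vectors, and choice probabilities come from a softmax over the three pairs. All function names and the toy data are hypothetical and are not taken from the CoCoG code.

```python
import math


def dot(u, v):
    """Dot-product similarity between two concept vectors (assumed metric)."""
    return sum(a * b for a, b in zip(u, v))


def pair_probabilities(c_i, c_j, c_k):
    """Softmax over the similarities of the pairs (i, j), (i, k), (j, k)."""
    sims = [dot(c_i, c_j), dot(c_i, c_k), dot(c_j, c_k)]
    m = max(sims)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in sims]
    total = sum(exps)
    return [e / total for e in exps]


def predict_odd_one_out(c_i, c_j, c_k):
    """Return the position (0, 1, or 2) of the item left out of the most similar pair."""
    probs = pair_probabilities(c_i, c_j, c_k)
    best_pair = probs.index(max(probs))
    # Pair 0 is (i, j) so k (position 2) is odd; pair 1 leaves j; pair 2 leaves i.
    return (2, 1, 0)[best_pair]


def prediction_accuracy(concepts, triplets, human_choices):
    """Fraction of triplets where the predicted odd item matches the human choice."""
    hits = 0
    for (i, j, k), choice in zip(triplets, human_choices):
        hits += predict_odd_one_out(concepts[i], concepts[j], concepts[k]) == choice
    return hits / len(triplets)


# Toy data (hypothetical): items 0 and 1 share a concept, item 2 does not.
concepts = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
triplets = [(0, 1, 2)]
human_choices = [2]  # position of the item the participant picked as odd
print(prediction_accuracy(concepts, triplets, human_choices))
```

Under these assumptions, a reported accuracy such as 64.07% would correspond to the value returned by `prediction_accuracy` on held-out human judgments, compared against a chance level of one in three for triplets.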
Taxonomy
Topics: Image Retrieval and Classification Techniques · Advanced Image and Video Retrieval Techniques · Visual Attention and Saliency Detection
