Pix2Code: Learning to Compose Neural Visual Concepts as Programs
Antonia Wüst, Wolfgang Stammer, Quentin Delfosse, Devendra Singh Dhami, Kristian Kersting

TL;DR
Pix2Code introduces a hybrid framework combining symbolic and neural methods to learn and interpret visual relational concepts as programs, enabling better generalization and human interpretability.
Contribution
It extends program synthesis to visual reasoning, integrating symbolic and neural representations for interpretable and generalizable visual concept learning.
Findings
Pix2Code effectively identifies compositional visual concepts.
The framework generalizes well to novel data and configurations.
The learned representations are human-interpretable and easy to revise.
Abstract
The challenge in learning abstract concepts from images in an unsupervised fashion lies in the required integration of visual perception and generalizable relational reasoning. Moreover, the unsupervised nature of this task makes it necessary for human users to be able to understand a model's learnt concepts and potentially revise false behaviours. To tackle both the generalizability and interpretability constraints of visual concept learning, we propose Pix2Code, a framework that extends program synthesis to visual relational reasoning by utilizing the abilities of both explicit, compositional symbolic and implicit neural representations. This is achieved by retrieving object representations from images and synthesizing relational concepts as lambda-calculus programs. We evaluate the diverse properties of Pix2Code on the challenging reasoning domains, Kandinsky Patterns and CURI,…
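To make the idea of a relational concept expressed as a lambda-calculus-style program more concrete, here is a minimal Python sketch. It is an illustration only, not the authors' implementation. The object format, the primitive names (`get_attr`, `map_scene`, `all_equal`, `compose`), and the example concept are all assumptions. The object representations are written by hand here, whereas the framework retrieves them from images.

```python
from typing import Callable, Dict, List

# Hypothetical symbolic object representation. In the framework these would be
# retrieved from an image; here they are hand-written for illustration.
Obj = Dict[str, str]
Scene = List[Obj]

# A tiny library of primitives (illustrative names, not taken from the paper).
get_attr = lambda key: lambda obj: obj[key]                # read one attribute
map_scene = lambda f: lambda scene: [f(o) for o in scene]  # apply f to each object
all_equal = lambda values: len(set(values)) <= 1           # True if all values match
compose = lambda f, g: lambda x: f(g(x))                   # function composition

# Concept program: "all objects in the scene share the same color".
same_color: Callable[[Scene], bool] = compose(all_equal, map_scene(get_attr("color")))

# Revising the concept only requires swapping one primitive argument.
same_shape: Callable[[Scene], bool] = compose(all_equal, map_scene(get_attr("shape")))

scene_a = [{"shape": "circle", "color": "red"}, {"shape": "square", "color": "red"}]
scene_b = [{"shape": "circle", "color": "red"}, {"shape": "circle", "color": "blue"}]

result_a_color = same_color(scene_a)  # both objects are red
result_b_color = same_color(scene_b)  # colors differ
result_b_shape = same_shape(scene_b)  # both objects are circles
```

Because the concept is an explicit composition of named primitives, a reader can inspect it directly and change it by editing a single term, which is the kind of interpretability and revisability the summary describes.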
Taxonomy
Topics: Neural Networks and Applications · Cell Image Analysis Techniques
