Image interpretation by iterative bottom-up top-down processing
Shimon Ullman, Liav Assif, Alona Strugatski, Ben-Zion Vatashsky, Hila Levy, Aviv Netanyahu, Adam Yaari

TL;DR
This paper presents an iterative bottom-up top-down model for scene understanding that combines visual extraction with stored knowledge, enabling flexible and generalizable interpretation of complex scenes.
Contribution
The model introduces a symmetric bidirectional BU-TD architecture with an algorithm for automatic instruction sequencing, improving scene interpretation and generalization capabilities.
Findings
Favorable combinatorial generalization to novel scene structures.
Effective integration of visual and non-visual information.
Comparison with human vision suggests biological plausibility.
Abstract
Scene understanding requires the extraction and representation of scene components together with their properties and inter-relations. We describe a model in which meaningful scene structures are extracted from the image by an iterative process, combining bottom-up (BU) and top-down (TD) networks, interacting through a symmetric bi-directional communication between them (counter-streams structure). The model constructs a scene representation by the iterative use of three components. The first model component is a BU stream that extracts selected scene elements, properties and relations. The second component (cognitive augmentation) augments the extracted visual representation based on relevant non-visual stored representations. It also provides input to the third component, the TD stream, in the form of a TD instruction, instructing the model what task to perform next. The TD stream…
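The iterative loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the names `bottom_up`, `cognitive_augmentation`, `interpret`, `TOY_IMAGE`, and `STORED_KNOWLEDGE` are all hypothetical, the "image" is a lookup table rather than pixels, and the instruction sequence is supplied by hand rather than chosen automatically. Because the abstract is truncated before it describes the TD stream, the sketch simply hands the TD instruction to the next BU pass and makes no claim about what the TD network computes.

```python
from dataclasses import dataclass, field

# Hypothetical toy "image": maps a TD instruction to the facts a BU pass
# would extract under that instruction. A real model would process pixels.
TOY_IMAGE = {
    "find person": {("person", "p1")},
    "find object held by p1": {("cup", "c1"), ("holds", "p1", "c1")},
}

# Hypothetical stored (non-visual) knowledge, keyed by object class.
STORED_KNOWLEDGE = {"cup": ("used_for", "cup", "drinking")}


@dataclass
class SceneRepresentation:
    """Accumulated scene structure: a set of fact tuples."""
    facts: set = field(default_factory=set)


def bottom_up(image, instruction):
    """BU pass: extract only the elements selected by the instruction."""
    return set(image.get(instruction, set()))


def cognitive_augmentation(scene, pending):
    """Add stored non-visual facts and return the next TD instruction.

    Assumption: the next instruction is taken from a fixed queue; the
    paper instead describes an algorithm for automatic sequencing.
    """
    added = set()
    for fact in scene.facts:
        extra = STORED_KNOWLEDGE.get(fact[0])
        if extra is not None:
            added.add(extra)
    scene.facts |= added
    return pending.pop(0) if pending else None


def interpret(image, instructions):
    """Alternate BU extraction and augmentation until no instruction remains."""
    scene = SceneRepresentation()
    pending = list(instructions)
    instruction = pending.pop(0) if pending else None
    steps = 0
    while instruction is not None:
        # The instruction stands in for the TD stream's guidance of the BU pass.
        scene.facts |= bottom_up(image, instruction)
        instruction = cognitive_augmentation(scene, pending)
        steps += 1
    return scene, steps


scene, steps = interpret(TOY_IMAGE, ["find person", "find object held by p1"])
```

In this toy run, the second pass depends on the result of the first (it looks for an object held by the person already found), which is the kind of step-by-step structure extraction the abstract describes; the `used_for` fact comes from stored knowledge rather than from the image.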
Taxonomy
Topics: Advanced Neural Network Applications · Medical Image Segmentation Techniques · Domain Adaptation and Few-Shot Learning
