Image interpretation by iterative bottom-up top-down processing

Shimon Ullman; Liav Assif; Alona Strugatski; Ben-Zion Vatashsky; Hila Levy; Aviv Netanyahu; Adam Yaari

arXiv:2105.05592·cs.CV·May 13, 2021·1 cites

TL;DR

This paper presents an iterative bottom-up top-down model for scene understanding that combines visual extraction with stored knowledge, enabling flexible and generalizable interpretation of complex scenes.

Contribution

The model introduces a symmetric bidirectional BU-TD architecture with an algorithm for automatic instruction sequencing, improving scene interpretation and generalization capabilities.

Findings

01. Favorable combinatorial generalization to novel scene structures.
02. Effective integration of visual and non-visual information.
03. Comparison with human vision suggests biological plausibility.

Abstract

Scene understanding requires the extraction and representation of scene components together with their properties and inter-relations. We describe a model in which meaningful scene structures are extracted from the image by an iterative process, combining bottom-up (BU) and top-down (TD) networks, interacting through a symmetric bi-directional communication between them (counter-streams structure). The model constructs a scene representation by the iterative use of three components. The first model component is a BU stream that extracts selected scene elements, properties and relations. The second component (cognitive augmentation) augments the extracted visual representation based on relevant non-visual stored representations. It also provides input to the third component, the TD stream, in the form of a TD instruction, instructing the model what task to perform next. The TD stream…
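The iterative loop described in the abstract can be illustrated with a toy control-flow sketch. This is not the authors' implementation: the scene dictionary, the knowledge table, and every function name below are hypothetical stand-ins, and real BU and TD streams would be neural networks rather than dictionary lookups. Because the abstract is cut off before it describes the TD stream, the sketch simply assumes that the TD stream passes the instruction on to guide the next BU pass.

```python
from dataclasses import dataclass, field

# Hypothetical toy scene: each object has one property and at most one relation.
SCENE = {
    "person": {"property": "standing", "relation": ("holding", "cup")},
    "cup": {"property": "red", "relation": ("on", "table")},
    "table": {"property": "wooden", "relation": None},
}

# Hypothetical non-visual stored knowledge used by the augmentation step.
KNOWLEDGE = {"cup": "used for drinking", "table": "furniture"}


@dataclass
class SceneRepresentation:
    elements: dict = field(default_factory=dict)   # object name -> property
    relations: list = field(default_factory=list)  # (subject, predicate, object)
    notes: dict = field(default_factory=dict)      # object name -> stored knowledge


def bottom_up(scene, instruction):
    """Stand-in for the BU stream: extract only the element the instruction selects."""
    target = instruction["target"]
    entry = scene[target]
    return {"name": target, "property": entry["property"], "relation": entry["relation"]}


def cognitive_augmentation(rep, extracted, knowledge):
    """Merge the extracted element with stored knowledge and pick the next instruction.

    Returns None when there is nothing left to follow.
    """
    name = extracted["name"]
    rep.elements[name] = extracted["property"]
    if name in knowledge:
        rep.notes[name] = knowledge[name]
    relation = extracted["relation"]
    if relation is None:
        return None
    predicate, other = relation
    rep.relations.append((name, predicate, other))
    if other in rep.elements:
        return None  # already interpreted; avoid looping forever
    return {"target": other}


def top_down(instruction):
    """Stand-in for the TD stream (assumed here to pass the instruction through)."""
    return instruction


def interpret(scene, knowledge, first_target, max_steps=10):
    """Alternate TD guidance, BU extraction, and augmentation until no instruction remains."""
    rep = SceneRepresentation()
    instruction = {"target": first_target}
    for _ in range(max_steps):
        guided = top_down(instruction)
        extracted = bottom_up(scene, guided)
        instruction = cognitive_augmentation(rep, extracted, knowledge)
        if instruction is None:
            break
    return rep


result = interpret(SCENE, KNOWLEDGE, "person")
print(result.elements)
print(result.relations)
print(result.notes)
```

Starting from "person", the loop follows each extracted relation to the next object, so the representation is built one element at a time rather than in a single pass. The `max_steps` cap and the visited check are design choices of this sketch, not details taken from the paper.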

Code & Models

Repositories

liavassif/BU-TD
PyTorch · Official

Taxonomy

Topics: Advanced Neural Network Applications · Medical Image Segmentation Techniques · Domain Adaptation and Few-Shot Learning