COSMO: Contextualized Scene Modeling with Boltzmann Machines

Ilker Bozcan; Sinan Kalkan

arXiv:1807.00511 · cs.RO · December 20, 2018

TL;DR

This paper introduces COSMO, a novel Boltzmann Machine-based model that integrates objects, relations, and affordances for comprehensive scene understanding and generation in robotics.

Contribution

It presents the first hybrid Boltzmann Machine model combining objects, relations, and affordances with shared tri-way connections for scene modeling.

Findings

01 Outperforms baselines in object and relation estimation tasks

02 Demonstrates ability to generate realistic scene examples

03 Provides a new dataset for relation estimation studies

Abstract

Scene modeling is crucial for robots that need to perceive, reason about, and manipulate the objects in their environments. In this paper, we adapt and extend Boltzmann Machines (BMs) for contextualized scene modeling. Although there are many models on the subject, ours is the first to bring together objects, relations, and affordances in a highly capable generative model. To this end, we introduce a hybrid version of BMs where relations and affordances are introduced with shared, tri-way connections into the model. Moreover, we contribute a dataset for relation estimation and modeling studies. We evaluate our method in comparison with several baselines on object estimation, out-of-context object detection, relation estimation, and affordance estimation tasks. Moreover, to illustrate the generative capability of the model, we show several example scenes that the model is able to…
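To make the idea of tri-way connections concrete, the following is a minimal sketch of an energy function for a toy Boltzmann machine in which each relation unit is tied to a pair of object units through a three-way weight. It is an illustration only and not the authors' formulation: the function name `energy`, the variable names, the tensor layout, and the choice to share one relation bias across all object pairs are assumptions made for this example, and affordance units and hidden units are left out.

```python
import numpy as np

def energy(o, r, W, b, T, c):
    """Energy of a toy Boltzmann machine with a three-way term (illustrative only).

    o: (n_obj,) binary object units, 1 if the object is present in the scene
    r: (n_rel, n_obj, n_obj) binary relation units, r[k, i, j] = 1 if
       relation k holds between objects i and j
    W: (n_obj, n_obj) symmetric pairwise object weights with zero diagonal
    b: (n_obj,) object biases
    T: (n_rel, n_obj, n_obj) three-way weights coupling r[k, i, j], o[i], o[j]
    c: (n_rel,) relation biases, shared across all object pairs (assumption)
    """
    # Standard pairwise Boltzmann machine term over object units.
    pairwise = -0.5 * (o @ W @ o) - b @ o
    # Three-way term: active only when the relation and both objects are on.
    triway = -np.einsum("kij,kij,i,j->", T, r, o, o)
    # Bias term for relation units.
    rel_bias = -np.einsum("k,kij->", c, r)
    return float(pairwise + triway + rel_bias)

# Tiny hypothetical scene: two objects and one relation type.
W = np.array([[0.0, 1.0], [1.0, 0.0]])
b = np.array([0.5, 0.5])
T = np.zeros((1, 2, 2))
T[0, 0, 1] = 2.0          # relation 0 between object 0 and object 1
c = np.array([0.1])
r = np.zeros((1, 2, 2))
r[0, 0, 1] = 1.0

e_both = energy(np.array([1.0, 1.0]), r, W, b, T, c)   # both objects present
e_one = energy(np.array([1.0, 0.0]), r, W, b, T, c)    # second object absent
print(e_both, e_one)
```

In this sketch the configuration with both objects present has lower energy than the one with an object missing, which is the sense in which a relation unit can raise the probability of the objects it connects.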

Code & Models

Repositories

bozcani/COSMO (tf, Official)
