TL;DR
This paper introduces RCFusion (recurrent convolutional fusion), an end-to-end architecture that combines RGB and depth features from different levels of abstraction to improve RGB-D object recognition.
Contribution
The novel RCFusion model synergistically fuses multi-level RGB and depth features for improved RGB-D object recognition performance.
Findings
Outperforms state-of-the-art methods on the RGB-D Object Dataset
Achieves higher accuracy in object categorization
Excels in instance recognition tasks
Abstract
Providing machines with the ability to recognize objects like humans has always been one of the primary goals of machine vision. The introduction of RGB-D cameras has paved the way for a significant leap forward in this direction thanks to the rich information provided by these sensors. However, the machine vision community still lacks an effective method to synergistically use the RGB and depth data to improve object recognition. In order to take a step in this direction, we introduce a novel end-to-end architecture for RGB-D object recognition called recurrent convolutional fusion (RCFusion). Our method generates compact and highly discriminative multi-modal features by combining complementary RGB and depth information representing different levels of abstraction. Extensive experiments on two popular datasets, RGB-D Object Dataset and JHUIT-50, show that RCFusion significantly…
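The fusion idea in the abstract (features from several levels of abstraction in each modality, merged into one compact vector) can be illustrated with a small sketch. This is not the authors' implementation. The global average pooling, the per-level projection, the plain tanh recurrence, the random weights standing in for learned ones, and every function name below (`fuse_levels`, `global_avg_pool`, `make_map`) are assumptions made for illustration only.

```python
import math
import random

def make_map(channels, size, value):
    """Build a toy C x H x W feature map (nested lists) filled with one value."""
    return [[[value] * size for _ in range(size)] for _ in range(channels)]

def global_avg_pool(feature_map):
    """Collapse a C x H x W feature map to a length-C vector of channel means."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
            for ch in feature_map]

def linear(vec, weights):
    """Multiply a vector by a weight matrix stored as a list of output rows."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]

def rand_matrix(rows, cols, rng):
    """Random weights; a stand-in for parameters that would be learned."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def fuse_levels(rgb_maps, depth_maps, hidden_size=8, proj_size=4, seed=0):
    """Fuse paired RGB/depth feature maps, ordered from low to high level.

    Each level is pooled, projected to a common size (assumed step, since
    channel counts differ across levels), concatenated across modalities,
    and fed as one time step to a simple tanh recurrence.
    """
    rng = random.Random(seed)
    w_xh = rand_matrix(hidden_size, 2 * proj_size, rng)
    w_hh = rand_matrix(hidden_size, hidden_size, rng)
    hidden = [0.0] * hidden_size
    for rgb, depth in zip(rgb_maps, depth_maps):
        rgb_vec = global_avg_pool(rgb)
        depth_vec = global_avg_pool(depth)
        rgb_proj = linear(rgb_vec, rand_matrix(proj_size, len(rgb_vec), rng))
        depth_proj = linear(depth_vec, rand_matrix(proj_size, len(depth_vec), rng))
        step_input = rgb_proj + depth_proj  # list concatenation of both modalities
        from_input = linear(step_input, w_xh)
        from_state = linear(hidden, w_hh)
        hidden = [math.tanh(a + b) for a, b in zip(from_input, from_state)]
    return hidden  # compact multi-modal feature after the last level

# Two hypothetical levels with different channel counts and spatial sizes.
rgb_levels = [make_map(2, 4, 0.5), make_map(3, 2, 1.0)]
depth_levels = [make_map(2, 4, 0.2), make_map(3, 2, 0.3)]
fused = fuse_levels(rgb_levels, depth_levels)
print(len(fused))
```

In a trained system the final vector would feed a classifier and all weights would be optimized jointly, which is what "end-to-end" implies; the sketch only shows how levels of differing shape can be reduced to one fixed-size representation.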
