Bimodal SegNet: Instance Segmentation Fusing Events and RGB Frames for Robotic Grasping
Sanket Kachole, Xiaoqian Huang, Fariborz Baghaei Naeini, Rajkumar Muthusamy, Dimitrios Makris, Yahya Zweiri

TL;DR
This paper introduces Bimodal SegNet, a deep learning model that combines event-based data and RGB images for improved object segmentation in robotic grasping under challenging conditions.
Contribution
It presents a novel dual-encoder architecture with spatial pyramidal pooling for fusing event and RGB data, enhancing segmentation accuracy in dynamic scenarios.
Findings
Achieves 6-10% higher segmentation accuracy than existing methods on the Event-based Segmentation (ESD) Dataset.
Remains robust under five image degradation challenges: occlusion, blur, brightness, trajectory, and scale variance.
Abstract
Object segmentation for robotic grasping under dynamic conditions often faces challenges such as occlusion, low light conditions, motion blur and object size variance. To address these challenges, we propose a Deep Learning network that fuses two types of visual signals, event-based data and RGB frame data. The proposed Bimodal SegNet network has two distinct encoders, one for each signal input and a spatial pyramidal pooling with atrous convolutions. Encoders capture rich contextual information by pooling the concatenated features at different resolutions while the decoder obtains sharp object boundaries. The evaluation of the proposed method undertakes five unique image degradation challenges including occlusion, blur, brightness, trajectory and scale variance on the Event-based Segmentation (ESD) Dataset. The evaluation results show a 6-10% segmentation accuracy improvement over…
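The abstract's idea of concatenating features from the two encoders and pooling them with atrous (dilated) convolutions at several rates can be illustrated with a toy example. The sketch below is not the authors' implementation: it works on 1D lists instead of 2D feature maps, uses a plain channel sum in place of learned cross-channel weights, and every name in it (`atrous_conv1d`, `fuse_channels`, `pyramid_pool`) is hypothetical.

```python
from typing import List, Sequence


def atrous_conv1d(signal: Sequence[float], kernel: Sequence[float], rate: int) -> List[float]:
    """Dilated 1D cross-correlation with zero padding, so output length equals input length."""
    center = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, weight in enumerate(kernel):
            # Kernel taps are spaced `rate` samples apart (rate=1 is an ordinary convolution).
            idx = i + (j - center) * rate
            if 0 <= idx < len(signal):
                acc += weight * signal[idx]
        out.append(acc)
    return out


def fuse_channels(rgb_feat: Sequence[Sequence[float]],
                  event_feat: Sequence[Sequence[float]]) -> List[Sequence[float]]:
    """Concatenate the two modalities along the channel axis (assumed fusion step)."""
    return list(rgb_feat) + list(event_feat)


def pyramid_pool(channels: Sequence[Sequence[float]],
                 kernel: Sequence[float],
                 rates: Sequence[int]) -> List[List[float]]:
    """Apply the same kernel at several dilation rates; one output channel per rate.

    Simplification: channels are summed position-wise before filtering,
    standing in for the learned cross-channel weights of a real layer.
    """
    summed = [sum(values) for values in zip(*channels)]
    return [atrous_conv1d(summed, kernel, rate) for rate in rates]


if __name__ == "__main__":
    rgb = [[1, 2, 3, 4, 5]]      # one toy RGB-encoder channel
    events = [[0, 1, 0, 1, 0]]   # one toy event-encoder channel
    fused = fuse_channels(rgb, events)
    for rate, branch in zip((1, 2), pyramid_pool(fused, [1, 1, 1], (1, 2))):
        print(f"rate={rate}: {branch}")
```

Larger rates widen the receptive field without adding kernel weights, which is why stacking several rates side by side captures context at multiple scales.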
Taxonomy
Topics: Robot Manipulation and Learning · Advanced Neural Network Applications · EEG and Brain-Computer Interfaces
Methods: Batch Normalization · Max Pooling · Kaiming Initialization · Softmax · Convolution · SegNet
