MMRNet: Improving Reliability for Multimodal Object Detection and Segmentation for Bin Picking via Multimodal Redundancy
Yuhao Chen, Hayden Gunraj, E. Zhixuan Zeng, Robbie Meyer, Maximilian Gilles, Alexander Wong

TL;DR
This paper introduces MMRNet, a multimodal redundancy system for robotic bin picking that enhances reliability by using multiple sensor modalities, a novel ensemble approach, and a new consistency score to improve detection and segmentation robustness.
Contribution
The paper presents the first multimodal redundancy framework with a gate fusion module, dynamic ensemble learning, and a label-free consistency score for improved reliability in robotic bin picking.
Findings
The system outperforms baseline models during sensor failure events.
The multimodal consistency (MC) score effectively indicates output reliability and uncertainty.
Multimodal redundancy enhances robustness in real-world deployment.
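To make the idea of a label-free consistency score concrete, here is a minimal sketch. It is not the paper's MC formula, which this summary does not give. It assumes, purely for illustration, that agreement between modality branches is measured as the mean pairwise IoU of their predicted masks. The function names `mask_iou` and `consistency_score` and the modality keys are hypothetical.

```python
from itertools import combinations


def mask_iou(a, b):
    """IoU of two binary masks, each given as a set of (row, col) pixels."""
    if not a and not b:
        return 1.0  # both empty: treated here as full agreement (an assumption)
    return len(a & b) / len(a | b)


def consistency_score(masks_by_modality):
    """Mean pairwise IoU across modality outputs; uses no ground-truth labels.

    Illustrative stand-in only, not the MC score defined in the paper.
    """
    pairs = list(combinations(masks_by_modality.values(), 2))
    if not pairs:
        raise ValueError("need at least two modalities to compare")
    return sum(mask_iou(a, b) for a, b in pairs) / len(pairs)


# Hypothetical outputs from two sensor branches for the same object.
healthy = {
    "modality_a": {(0, 0), (0, 1), (1, 0), (1, 1)},
    "modality_b": {(0, 0), (0, 1), (1, 0), (1, 1)},
}
# One branch returns nothing, as it might during a sensor failure.
failed = {
    "modality_a": {(0, 0), (0, 1), (1, 0), (1, 1)},
    "modality_b": set(),
}

healthy_score = consistency_score(healthy)
failed_score = consistency_score(failed)
```

Under these assumptions, a low score flags disagreement between branches without needing labels, which is the property the findings attribute to the MC score.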
Abstract
Recently, there has been tremendous interest in Industry 4.0 infrastructure to address labor shortages in global supply chains. Deploying artificial intelligence-enabled robotic bin picking systems in the real world has become particularly important for reducing the stress and physical demands on workers while increasing the speed and efficiency of warehouses. To this end, artificial intelligence-enabled robotic bin picking systems may be used to automate order picking, but with the risk of causing expensive damage during an abnormal event such as sensor failure. As such, reliability becomes a critical factor for translating artificial intelligence research to real-world applications and products. In this paper, we propose a reliable object detection and segmentation system with MultiModal Redundancy (MMRNet) for tackling object detection and segmentation for robotic bin picking using data from…
Taxonomy
Topics: Industrial Vision Systems and Defect Detection · Advanced Manufacturing and Logistics Optimization · Robot Manipulation and Learning
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
