FrankenBot: Brain-Morphic Modular Orchestration for Robotic Manipulation with Vision-Language Models
Shiyi Wang, Wenbo Li, Yiteng Chen, Qingyao Wu, Huiping Zhuang

TL;DR
FrankenBot is a brain-inspired, modular robotic manipulation framework leveraging vision-language models to integrate multiple cognitive functions, achieving high efficiency and robustness in complex environments without retraining.
Contribution
The paper introduces FrankenBot, a novel VLM-driven, brain-morphic architecture that unifies key robotic functions into a cohesive system inspired by human brain structure, enhancing efficiency and capability.
Findings
Improves anomaly detection and handling in robotic manipulation.
Enables long-term memory and stable operation without retraining.
Demonstrates superior performance in simulation and real-world tests.
Abstract
Developing a general robot manipulation system capable of performing a wide range of tasks in complex, dynamic, and unstructured real-world environments has long been a challenge. It is widely recognized that achieving human-like efficiency and robustness in manipulation requires the robotic brain to integrate a comprehensive set of functions, such as task planning, policy generation, anomaly monitoring and handling, and long-term memory, and to operate efficiently across all of them. Vision-Language Models (VLMs), pretrained on massive multimodal data, have acquired rich world knowledge and exhibit exceptional scene understanding and multimodal reasoning capabilities. However, existing methods typically realize only a single function or a subset of functions within the robotic brain, without integrating them into a unified cognitive architecture. Inspired by a…
Taxonomy
Topics: Multimodal Machine Learning Applications · EEG and Brain-Computer Interfaces · Memory and Neural Mechanisms
