RoboEgo System Card: An Omnimodal Model with Native Full Duplexity
Yiqun Yao, Xiang Li, Xin Jiang, Xuezhi Fang, Naitong Yu, Aixin Sun, Yequan Wang

TL;DR
RoboEgo is a unified omnimodal model that natively supports full duplexity, enabling low-latency, natural multimodal interaction in real-world conversations.
Contribution
Introduces RoboEgo, a model architecture that natively supports full-duplex multimodal processing with low latency, addressing two challenges in real-time multimodal interaction: handling many modalities and responding to rapidly changing instructions.
Findings
Achieves a theoretical duplex latency of 80 ms.
Demonstrates superior responsiveness and speech naturalness.
Maintains content quality comparable to semi-duplex models.
Abstract
Humans naturally process real-world multimodal information in a full-duplex manner. In artificial intelligence, replicating this capability is essential for advancing model development and deployment, particularly in embodied contexts. The development of multimodal models faces two primary challenges: (1) effectively handling more than three modalities, such as vision, audio, and text; and (2) delivering full-duplex responses to rapidly evolving human instructions. To facilitate research on models that support both omnimodal processing and full duplexity, we present RoboEgo (alias: FLM-Ego), a unified model system designed to address both challenges. RoboEgo incorporates a backbone architecture and algorithms that natively support full duplexity, achieving a theoretical duplex latency of 80 ms. In streaming visually grounded conversations under real-world conditions, RoboEgo exhibits…
Taxonomy
Topics: Modular Robots and Swarm Intelligence · Cellular Automata and Applications · Computability, Logic, AI Algorithms
