KAITIAN: A Unified Communication Framework for Enabling Efficient Collaboration Across Heterogeneous Accelerators in Embodied AI Systems
Jieke Lin, Wanyu Wang, Longxiang Yin, Yinhe Han

TL;DR
KAITIAN is a unified communication framework that lets heterogeneous accelerators in embodied AI systems interoperate efficiently, speeding up training by up to 42% and improving resource utilization.
Contribution
KAITIAN introduces a load-adaptive scheduling mechanism and a unified communication abstraction that bridges vendor-specific libraries, enabling heterogeneous accelerators to collaborate on the same workload.
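To make the idea of load-adaptive scheduling concrete, here is a minimal sketch of one plausible form it could take: splitting a global batch across devices in proportion to their measured throughput. The summary does not describe KAITIAN's actual scheduling algorithm, so the function name `split_batch`, the device names, and the throughput figures are all hypothetical.

```python
from typing import Dict


def split_batch(global_batch: int, throughputs: Dict[str, float]) -> Dict[str, int]:
    """Split a global batch across devices in proportion to throughput.

    Illustrative only: the proportional rule and the largest-remainder
    rounding are assumptions, not KAITIAN's documented method.
    """
    if global_batch < 0:
        raise ValueError("global_batch must be non-negative")
    total = sum(throughputs.values())
    if total <= 0:
        raise ValueError("total throughput must be positive")

    # Ideal (fractional) share for each device.
    exact = {dev: global_batch * t / total for dev, t in throughputs.items()}
    # Round down first, then hand out the samples lost to rounding.
    shares = {dev: int(x) for dev, x in exact.items()}
    leftover = global_batch - sum(shares.values())
    by_remainder = sorted(exact, key=lambda dev: exact[dev] - shares[dev], reverse=True)
    for dev in by_remainder[:leftover]:
        shares[dev] += 1
    return shares


# Hypothetical devices: one accelerator measured at 3x the throughput of the other.
print(split_batch(256, {"gpu0": 3.0, "npu0": 1.0}))
```

The point of such a split is that faster devices receive more samples per step, so slower devices do not leave the others idle at each synchronization point.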
Findings
Up to 42% faster training.
Minimal communication overhead (2.8–4.3%).
Maintains model accuracy in heterogeneous setups.
Abstract
Embodied Artificial Intelligence (AI) systems, such as autonomous robots and intelligent vehicles, are increasingly reliant on diverse heterogeneous accelerators (e.g., GPGPUs, NPUs, FPGAs) to meet stringent real-time processing and energy-efficiency demands. However, the proliferation of vendor-specific proprietary communication libraries creates significant interoperability barriers, hindering seamless collaboration between different accelerator types and leading to suboptimal resource utilization and performance bottlenecks in distributed AI workloads. This paper introduces KAITIAN, a novel distributed communication framework designed to bridge this gap. KAITIAN provides a unified abstraction layer that intelligently integrates vendor-optimized communication libraries for intra-group efficiency with general-purpose communication protocols for inter-group interoperability. Crucially,…
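The two-level design described in the abstract (vendor-optimized libraries inside a group, a general-purpose protocol between groups) can be illustrated with a small simulation of a hierarchical all-reduce. This is a sketch under stated assumptions, not KAITIAN's implementation: plain Python lists stand in for device tensors, ordinary function calls stand in for both communication layers, and the names `hierarchical_all_reduce` and `elementwise_sum` are hypothetical.

```python
from typing import Dict, List

Vector = List[float]


def elementwise_sum(vectors: List[Vector]) -> Vector:
    """Sum equally sized vectors element by element."""
    return [sum(vals) for vals in zip(*vectors)]


def hierarchical_all_reduce(groups: Dict[str, List[Vector]]) -> Dict[str, List[Vector]]:
    """Simulate a two-level sum all-reduce over groups of devices.

    `groups` maps a group name (one accelerator type) to the gradient
    vector held by each device in that group.
    """
    # Step 1: reduce inside each group. In a real system this is where a
    # vendor-optimized library would run (assumption: simulated here).
    partial = {name: elementwise_sum(vecs) for name, vecs in groups.items()}

    # Step 2: reduce across groups. In a real system this would use a
    # general-purpose protocol between group leaders (assumption: simulated).
    total = elementwise_sum(list(partial.values()))

    # Step 3: broadcast the global result back to every device in every group.
    return {name: [list(total) for _ in vecs] for name, vecs in groups.items()}


# Hypothetical setup: two devices of one type and one device of another.
result = hierarchical_all_reduce({
    "gpu": [[1.0, 2.0], [3.0, 4.0]],
    "npu": [[10.0, 20.0]],
})
print(result)
```

After the call, every device holds the same summed vector regardless of which group it belongs to. Only one partial result per group crosses the slower inter-group path.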
Taxonomy
Topics: Modular Robots and Swarm Intelligence
