Kairos: A Scalable Serving System for Physical AI
Yinwei Dai, Ganesh Ananthanarayanan, Landon Cox, Xenofon Foukas, Bozidar Radunovic, Ravi Netravali

TL;DR
Kairos is a scalable serving system for Physical AI that manages the multi-round loop of inference and action execution across robot fleets, reducing end-to-end task latency by 31.8-66.5% in the reported evaluation.
Contribution
The paper introduces Kairos, which the authors describe as the first multi-robot serving system to treat the generate-execute loop as a first-class citizen, taking an active role in the execution phase as well as in inference.
Findings
Reduces end-to-end task latency by 31.8-66.5%
Scales effectively with robot fleet size
Outperforms state-of-the-art digital AI serving practices
Abstract
Physical AI is experiencing rapid growth, with frontier foundation models increasing its capabilities across general environments. Physical AI tasks are characterized by inference properties that are markedly different from those of digital AI. They consist of multiple rounds of inference and action execution, generate a chunk of actions in each inference round, and asynchronously interleave inference and execution. This makes existing digital AI serving systems unsuited to physical AI, a shortcoming that must be addressed for physical AI models to be widely adopted, given their size and the scale of the robot fleets they have to serve. To fill this gap, we design Kairos, the first multi-robot serving system that makes the generate-execute loop a first-class citizen, with active involvement in the execution phase. Across a wide range of physical AI models and robots, Kairos reduces the average…
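To make the task structure in the abstract concrete, here is a minimal sketch of one robot's generate-execute loop, comparing a strictly sequential loop with one that overlaps the next inference round with execution of the current action chunk. It is not Kairos's scheduling algorithm, which this summary does not describe. The `TaskProfile` fields, both function names, and all timing values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskProfile:
    """Hypothetical per-task parameters; none of these values come from the paper."""
    rounds: int       # number of inference rounds in the task
    infer_s: float    # seconds per inference round
    chunk_size: int   # actions produced by each inference round
    action_s: float   # seconds to execute one action


def synchronous_latency(p: TaskProfile) -> float:
    """Each round runs inference, then executes the whole chunk, with no overlap."""
    return p.rounds * (p.infer_s + p.chunk_size * p.action_s)


def interleaved_latency(p: TaskProfile) -> float:
    """Inference for the next round starts when the current chunk starts executing."""
    infer_start = 0.0  # when the current round's inference begins
    exec_done = 0.0    # when the robot finishes its latest chunk
    for _ in range(p.rounds):
        infer_done = infer_start + p.infer_s
        # A chunk can start only once it exists and the previous chunk has finished.
        exec_start = max(infer_done, exec_done)
        exec_done = exec_start + p.chunk_size * p.action_s
        # Assumption: the next inference overlaps this chunk's execution.
        infer_start = exec_start
    return exec_done


profile = TaskProfile(rounds=4, infer_s=0.5, chunk_size=8, action_s=0.125)
print(synchronous_latency(profile))   # 6.0
print(interleaved_latency(profile))   # 4.5
```

In this toy model, interleaving hides every inference round except the first behind execution whenever a chunk takes at least as long to execute as a round of inference. The model ignores contention between robots sharing a model server and the staleness of observations taken before a chunk finishes, both of which a real multi-robot serving system would have to handle.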
