Towards Resource-Efficient Compound AI Systems
Gohar Irfan Chaudhry, Esha Choukse, Íñigo Goiri, Rodrigo Fonseca, Adam Belay, Ricardo Bianchini

TL;DR
This paper introduces Murakkab, a prototype system for resource-efficient Compound AI Systems that uses a declarative workflow model and adaptive runtime to improve efficiency and resource management.
Contribution
The paper proposes a declarative workflow programming model that decouples application logic from execution details, together with an adaptive runtime system for resource-aware scheduling in Compound AI Systems. Both are demonstrated through a prototype implementation, Murakkab.
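To make the decoupling idea concrete, the following is a minimal sketch and not Murakkab's actual API. It assumes a workflow is declared only as tasks with a required capability and a quality floor, and that a runtime then picks an executor for each task. All names (`Task`, `Executor`, `plan`) and all quality and energy figures are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """Declarative task: says what is needed, not where or how it runs."""
    name: str
    capability: str      # e.g. "speech-to-text" (hypothetical label)
    min_quality: float   # lowest acceptable quality score in [0, 1]


@dataclass(frozen=True)
class Executor:
    """A concrete model/hardware pairing known to the runtime (made-up values)."""
    name: str
    capability: str
    quality: float       # assumed profiled quality score
    energy_j: float      # assumed energy per invocation, in joules


def plan(tasks, executors):
    """For each task, pick the lowest-energy executor that meets its quality floor."""
    chosen = {}
    for task in tasks:
        eligible = [
            e for e in executors
            if e.capability == task.capability and e.quality >= task.min_quality
        ]
        if not eligible:
            raise ValueError(f"no executor satisfies task {task.name!r}")
        chosen[task.name] = min(eligible, key=lambda e: e.energy_j).name
    return chosen


# Application logic: the workflow names no model, device, or instance count.
WORKFLOW = [
    Task("transcribe", "speech-to-text", 0.90),
    Task("summarize", "text-generation", 0.80),
]

# Runtime knowledge: illustrative profiles, not measurements from the paper.
EXECUTORS = [
    Executor("stt-large-gpu", "speech-to-text", 0.95, 120.0),
    Executor("stt-small-cpu", "speech-to-text", 0.91, 40.0),
    Executor("llm-large-gpu", "text-generation", 0.92, 300.0),
    Executor("llm-small-gpu", "text-generation", 0.78, 90.0),
]

PLAN = plan(WORKFLOW, EXECUTORS)
print(PLAN)
```

Because the workflow never names an executor, the runtime is free to change the mapping when profiles or resource availability change, without touching application code. In this sketch the cheaper speech model is chosen because it still meets the quality floor, while the smaller text model is rejected because it does not.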
Findings
Up to 3.4x faster workflow completion times
Approximately 4.5x higher energy efficiency
Effective resource optimization in AI workflows
Abstract
Compound AI Systems, integrating multiple interacting components like models, retrievers, and external tools, have emerged as essential for addressing complex AI tasks. However, current implementations suffer from inefficient resource utilization due to tight coupling between application logic and execution details, a disconnect between orchestration and resource management layers, and the perceived exclusiveness between efficiency and quality. We propose a vision for resource-efficient Compound AI Systems through a declarative workflow programming model and an adaptive runtime system for dynamic scheduling and resource-aware decision-making. Decoupling application logic from low-level details exposes levers for the runtime to flexibly configure the execution environment and resources, without compromising on quality. Enabling collaboration between the workflow orchestration and…
Taxonomy
Topics: Machine Learning and Algorithms · Cloud Computing and Resource Management · Fault Detection and Control Systems
