Cornfigurator: Automated Planning for Any-to-Any Multimodal Model Serving
Jeff J. Ma, Jae-Won Chung, Jisang Ahn, Yizhuo Liang, Runyu Lu, Akshay Jajoo, Myungjin Lee, Mosharaf Chowdhury

TL;DR
Cornfigurator is a deployment planner for inference serving of generic Any-to-Any multimodal models. It searches for deployments that maximize goodput, the throughput of requests that meet their latency targets.
Contribution
It introduces the first automated deployment planning system for generic Any-to-Any models, handling heterogeneous computation paths and scaling characteristics.
Findings
Cornfigurator's plans match or outperform existing systems in goodput, by 1.12× to 6.32×.
Uses coarse-to-fine evaluation to efficiently explore deployment strategies.
Achieves higher throughput than existing systems while meeting latency targets.
Abstract
Any-to-Any models are an emerging class of multimodal models that accept combinations of text and multimodal data as input and generate them as output, introducing heterogeneous computation paths and component scaling characteristics. There are existing mechanisms for deploying Any-to-Any models--or special cases of them--for inference serving, but they either require manual effort and expertise to tune, or do not generalize to generic Any-to-Any models. We present Cornfigurator, the first deployment planner for generic Any-to-Any model inference serving. The goal of Cornfigurator is to maximize the overall goodput of serving the model, defined as the throughput of requests meeting their latency targets. To do so, based on model and workload characteristics, Cornfigurator explores the full spectrum of deployment strategies, from colocation to disaggregation and mixing different…
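The goodput metric defined in the abstract (throughput counted only over requests that meet their latency targets) can be illustrated with a minimal sketch. This is not Cornfigurator's code: the `Request` class, its field names, and the fixed measurement window `window_s` are hypothetical, and the paper may define latency targets differently, for example per modality or per generation stage.

```python
from dataclasses import dataclass


@dataclass
class Request:
    # Hypothetical record of one served request (names are assumptions).
    latency_s: float         # observed end-to-end latency in seconds
    latency_target_s: float  # latency target for this request in seconds


def throughput(requests: list[Request], window_s: float) -> float:
    """All completed requests per second over the measurement window."""
    if window_s <= 0:
        raise ValueError("window_s must be positive")
    return len(requests) / window_s


def goodput(requests: list[Request], window_s: float) -> float:
    """Requests per second that met their latency target over the window."""
    if window_s <= 0:
        raise ValueError("window_s must be positive")
    met = sum(1 for r in requests if r.latency_s <= r.latency_target_s)
    return met / window_s


# Four requests complete in a 2-second window; one misses its target.
served = [
    Request(latency_s=0.8, latency_target_s=1.0),
    Request(latency_s=1.4, latency_target_s=1.0),  # misses its target
    Request(latency_s=2.5, latency_target_s=5.0),
    Request(latency_s=4.0, latency_target_s=5.0),
]
print(throughput(served, window_s=2.0))  # 2.0 requests/s
print(goodput(served, window_s=2.0))     # 1.5 requests/s
```

In this toy case throughput is 2.0 requests/s but goodput is 1.5 requests/s, which is why a planner that optimizes goodput can prefer a different deployment than one that optimizes raw throughput.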
