Argus: Quality-Aware High-Throughput Text-to-Image Inference Serving System

Shubham Agarwal; Subrata Mitra; Saud Iqbal

arXiv:2511.06724·cs.CV·November 11, 2025

TL;DR

Argus is a text-to-image inference serving system that dynamically selects an approximation level for each prompt, reducing latency SLO violations while raising average quality and throughput.

Contribution

Argus introduces an approach that adaptively chooses an approximation strategy for each prompt served by a T2I model, balancing image quality against throughput under high load.

Findings

01. Achieves 10x fewer latency SLO violations
02. Provides 10% higher average quality
03. Increases throughput by 40%

Abstract

Text-to-image (T2I) models have gained significant popularity. Most of these are diffusion models with unique computational characteristics, distinct from both traditional small-scale ML models and large language models. They are highly compute-bound and use an iterative denoising process to generate images, leading to very high inference time. This creates significant challenges in designing a high-throughput system. We discovered that a large fraction of prompts can be served using faster, approximated models. However, the approximation setting must be carefully calibrated for each prompt to avoid quality degradation. Designing a high-throughput system that assigns each prompt to the appropriate model and compatible approximation setting remains a challenging problem. We present Argus, a high-throughput T2I inference system that selects the right level of approximation for each prompt…
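The idea of choosing an approximation level per prompt can be illustrated with a minimal sketch. This is not the Argus algorithm, which the truncated abstract does not describe. It only assumes that some predictor estimates output quality for a prompt at each level, and that the slowest level is the least approximated. Every name here (`ApproxLevel`, `pick_level`, `toy_predictor`, the latency figures, and the quality threshold) is hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class ApproxLevel:
    name: str         # hypothetical label for an approximation setting
    latency_s: float  # assumed per-image latency at this level, in seconds


def pick_level(
    prompt: str,
    levels: Sequence[ApproxLevel],
    predict_quality: Callable[[str, ApproxLevel], float],
    min_quality: float,
) -> ApproxLevel:
    """Return the fastest level whose predicted quality meets min_quality.

    Falls back to the slowest level (assumed least approximated) when no
    level is predicted to meet the threshold.
    """
    if not levels:
        raise ValueError("at least one approximation level is required")
    by_speed = sorted(levels, key=lambda lvl: lvl.latency_s)
    for lvl in by_speed:
        if predict_quality(prompt, lvl) >= min_quality:
            return lvl
    return by_speed[-1]


# Illustrative levels; the latencies are made up.
LEVELS = [
    ApproxLevel("full", 4.0),
    ApproxLevel("moderate", 2.0),
    ApproxLevel("aggressive", 1.0),
]


def toy_predictor(prompt: str, lvl: ApproxLevel) -> float:
    # Made-up rule: longer prompts lose more quality under approximation.
    penalty = {"full": 0.0, "moderate": 0.02, "aggressive": 0.05}[lvl.name]
    return 1.0 - penalty * len(prompt.split())


if __name__ == "__main__":
    for text in ["a cat", "a red cat on sofa"]:
        print(text, "->", pick_level(text, LEVELS, toy_predictor, 0.8).name)
```

Under the toy rule, short prompts are routed to the fastest level and longer ones fall back toward the full model. A real serving system would also have to account for queueing, batching, and latency SLOs, which this sketch ignores.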

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Advanced Neural Network Applications · Cell Image Analysis Techniques