Triage: Routing Software Engineering Tasks to Cost-Effective LLM Tiers via Code Quality Signals
Lech Madeyski

TL;DR
Triage is a framework that uses code quality signals to route software engineering tasks to the most cost-effective LLM tier while maintaining output quality.
Contribution
The paper introduces a routing method that uses code health metrics to reduce LLM usage costs in software engineering tasks.
Findings
Cost-effective routing depends on code health pass rates exceeding cost ratios.
Code health metrics can effectively discriminate the required model tier.
Analytical conditions identify when tier asymmetry benefits routing decisions.
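One hedged way to read the first finding, assuming an escalate-on-failure scheme that this summary does not spell out: a task goes to a cheaper tier first, and is re-run on the expensive tier only if the cheaper output fails the verification gate. Under that assumption, trying the cheaper tier first is cheaper in expectation exactly when its pass rate exceeds the cheap-to-expensive cost ratio. The function names and the numbers in the usage note are hypothetical, not taken from the paper.

```python
def expected_cost_cheap_first(pass_rate: float, cost_cheap: float, cost_expensive: float) -> float:
    """Expected cost of trying the cheap tier first and escalating on failure.

    Assumption (not from the paper): a failed cheap attempt is still paid for,
    and the task is then re-run once on the expensive tier.
    """
    return cost_cheap + (1.0 - pass_rate) * cost_expensive


def cheap_first_pays_off(pass_rate: float, cost_cheap: float, cost_expensive: float) -> bool:
    """True when cheap-first beats always using the expensive tier.

    cost_cheap + (1 - p) * cost_expensive < cost_expensive
    rearranges to p > cost_cheap / cost_expensive.
    """
    return pass_rate > cost_cheap / cost_expensive
```

For instance, with a hypothetical cost ratio of 1:5, a cheap-tier pass rate of 0.5 clears the 0.2 break-even point, while a pass rate of 0.1 does not.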
Abstract
Context: AI coding agents route every task to a single frontier large language model (LLM), paying premium inference cost even when many tasks are routine. Objectives: We propose Triage, a framework that uses code health metrics -- indicators of software maintainability -- as a routing signal to assign each task to the cheapest model tier whose output passes the same verification gate as the expensive model. Methods: Triage defines three capability tiers (light, standard, heavy -- mirroring, e.g., Haiku, Sonnet, Opus) and routes tasks based on pre-computed code health sub-factors and task metadata. We design an evaluation comparing three routing policies on SWE-bench Lite (300 tasks across three model tiers): heuristic thresholds, a trained ML classifier, and a perfect-hindsight oracle. Results: We analytically derived two falsifiable conditions under which the tier-dependent…
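The routing policies named in the abstract can be illustrated with a minimal sketch. It assumes that healthier code and smaller tasks can go to cheaper tiers; the `Task` fields, the threshold values, and the function names are hypothetical placeholders, not the paper's actual sub-factors or cut-offs. The trained ML classifier is omitted because the abstract gives no detail about it.

```python
from dataclasses import dataclass
from typing import Mapping, Optional

# Tiers ordered from cheapest to most expensive, as named in the abstract.
TIERS = ("light", "standard", "heavy")


@dataclass(frozen=True)
class Task:
    code_health: float  # hypothetical aggregate score, higher means healthier code
    files_touched: int  # hypothetical task-metadata field


def route_heuristic(task: Task, healthy: float = 8.0, unhealthy: float = 4.0,
                    max_files: int = 1) -> str:
    """Threshold policy: thresholds here are illustrative assumptions."""
    if task.code_health >= healthy and task.files_touched <= max_files:
        return "light"
    if task.code_health >= unhealthy:
        return "standard"
    return "heavy"


def route_oracle(passed: Mapping[str, bool]) -> Optional[str]:
    """Perfect hindsight: cheapest tier whose output passed the gate, else None."""
    for tier in TIERS:
        if passed.get(tier, False):
            return tier
    return None
```

The oracle needs the verification outcome for every tier, so it is only usable as an upper bound in an offline evaluation, not as a deployable policy.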
