Accelerating Language Model Workflows with Prompt Choreography

TJ Bai; Jason Eisner

arXiv:2512.23049 · cs.CL · December 30, 2025

TL;DR

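Prompt Choreography is a framework that accelerates large language model workflows by maintaining a dynamic, global KV cache of previously encoded messages. Each call can reuse any subset of those messages, parallel calls are supported, and fine-tuning the model to work with the cache helps it reproduce the results of the uncached workflow.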

Contribution

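The paper introduces Prompt Choreography, a framework that executes LLM workflows over a dynamic, global KV cache. Each LLM call can attend to an arbitrary, reordered subset of previously encoded messages, which avoids redundant encoding, reduces latency, and supports parallel calls.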

Findings

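01. Reduces per-message latency: time-to-first-token is 2.0–6.2× faster.

02. Achieves end-to-end speedups of more than 2.2× in some workflows dominated by redundant computation.

03. Cached encodings sometimes give different results from re-encoding messages in a new context; fine-tuning the LLM to work with the cache can help it mimic the original results.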

Abstract

Large language models are increasingly deployed in multi-agent workflows. We introduce Prompt Choreography, a framework that efficiently executes LLM workflows by maintaining a dynamic, global KV cache. Each LLM call can attend to an arbitrary, reordered subset of previously encoded messages. Parallel calls are supported. Though caching messages' encodings sometimes gives different results from re-encoding them in a new context, we show in diverse settings that fine-tuning the LLM to work with the cache can help it mimic the original results. Prompt Choreography significantly reduces per-message latency (2.0–6.2× faster time-to-first-token) and achieves substantial end-to-end speedups (>2.2×) in some workflows dominated by redundant computation.
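
The caching idea in the abstract can be illustrated with a toy sketch. It is not the authors' implementation or API: the class `ToyGlobalCache`, the helper `baseline_tokens_encoded`, the example messages, and the use of whitespace tokens in place of real KV tensors are all assumptions made for illustration. The sketch only counts how many tokens must be encoded when each message is encoded once and then reused by later calls in any order, compared with re-encoding every selected message for every call. It ignores the effect noted in the abstract, that cached encodings may differ from encodings produced in a new context.

```python
from dataclasses import dataclass, field


@dataclass
class ToyGlobalCache:
    """Hypothetical stand-in for a global KV cache: message id -> encoded tokens."""
    store: dict = field(default_factory=dict)
    tokens_encoded: int = 0  # running count of tokens that had to be encoded

    def encode(self, msg_id: str, text: str) -> None:
        """Encode a message once; repeated requests reuse the stored entry."""
        if msg_id in self.store:
            return
        tokens = text.split()  # whitespace tokens stand in for real KV tensors
        self.store[msg_id] = tokens
        self.tokens_encoded += len(tokens)

    def context(self, msg_ids: list) -> list:
        """Assemble a call's context from any subset of cached messages, in any order."""
        ctx = []
        for msg_id in msg_ids:
            ctx.extend(self.store[msg_id])
        return ctx


def baseline_tokens_encoded(messages: dict, calls: list) -> int:
    """Cost when every call re-encodes each of its selected messages from scratch."""
    return sum(len(messages[m].split()) for call in calls for m in call)


# Illustrative workflow: three calls that select overlapping, reordered messages.
messages = {
    "system": "You are a careful reviewer",
    "draft": "The answer is forty two",
    "critique": "Check the arithmetic again",
}
calls = [
    ["system", "draft"],
    ["system", "critique", "draft"],
    ["draft", "system"],
]

cache = ToyGlobalCache()
for call in calls:
    for msg_id in call:
        cache.encode(msg_id, messages[msg_id])  # no-op if already cached
    cache.context(call)  # a real system would attend over these entries

cached_cost = cache.tokens_encoded
baseline_cost = baseline_tokens_encoded(messages, calls)
print(f"tokens encoded with cache: {cached_cost}, without: {baseline_cost}")
```

In this toy workflow the saving comes entirely from overlap between calls, which matches the abstract's point that the largest end-to-end gains appear in workflows dominated by redundant computation. The token counts here say nothing about the measured speedups reported in the paper.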

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications