Parrot: Efficient Serving of LLM-based Applications with Semantic Variable
Chaofan Lin, Zhenhua Han, Chengruidong Zhang, Yuqing Yang, Fan Yang, Chen Chen, Lili Qiu

TL;DR
Parrot introduces Semantic Variables to expose application-level knowledge to LLM services, enabling end-to-end optimization and significantly improving the performance of LLM-based applications.
Contribution
It proposes Semantic Variables as a novel abstraction to enhance public LLM services with application-level context, improving overall application performance.
Findings
Achieves up to an order-of-magnitude (roughly 10x) improvement in the end-to-end performance of LLM applications.
Enables data flow analysis across multiple LLM requests.
Demonstrates significant benefits in practical use cases.
Abstract
The rise of large language models (LLMs) has enabled LLM-based applications (a.k.a. AI agents or co-pilots), a new software paradigm that combines the strength of LLM and conventional software. Diverse LLM applications from different tenants could design complex workflows using multiple LLM requests to accomplish one task. However, they have to use the over-simplified request-level API provided by today's public LLM services, losing essential application-level information. Public LLM services have to blindly optimize individual LLM requests, leading to sub-optimal end-to-end performance of LLM applications. This paper introduces Parrot, an LLM service system that focuses on the end-to-end experience of LLM-based applications. Parrot proposes Semantic Variable, a unified abstraction to expose application-level knowledge to public LLM services. A Semantic Variable annotates an…
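To make the idea of data flow analysis across requests concrete, here is a minimal sketch, not Parrot's actual API. It assumes each LLM request declares the named variables its prompt reads and the one variable it produces. The `Request` class, the helper functions, and the example workflow are all hypothetical names chosen for illustration. Once a service can see such variables, it can recover which requests depend on which others, something a request-level API hides.

```python
from dataclasses import dataclass
from graphlib import TopologicalSorter  # standard library, Python 3.9+


@dataclass(frozen=True)
class Request:
    """Hypothetical stand-in for one LLM request in an application workflow."""
    name: str
    inputs: tuple[str, ...]  # variables the prompt reads
    output: str              # variable the request produces


def dependency_graph(requests):
    """Map each request name to the set of request names it depends on.

    A request depends on another when it reads a variable the other produces.
    Variables with no producer (e.g. user input) create no dependency.
    """
    producer = {r.output: r.name for r in requests}
    return {
        r.name: {producer[v] for v in r.inputs if v in producer}
        for r in requests
    }


def execution_order(requests):
    """Return one valid order in which the requests can run."""
    return list(TopologicalSorter(dependency_graph(requests)).static_order())


# Illustrative three-step workflow; "task" comes from the user, not a request.
requests = [
    Request("write_code", inputs=("task",), output="code"),
    Request("write_tests", inputs=("task", "code"), output="tests"),
    Request("review", inputs=("code", "tests"), output="report"),
]

graph = dependency_graph(requests)
order = execution_order(requests)
```

In this sketch `graph` shows that `review` depends on both `write_code` and `write_tests`. A service with that knowledge could, for example, schedule dependent requests back to back instead of treating each one as an unrelated call.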
Taxonomy
Topics: Digital Rights Management and Security
Methods: Attention Is All You Need · Sigmoid Activation · Tanh Activation · Long Short-Term Memory · Softmax · Linear Layer · Parrot
