PopPy: Opportunistically Exploiting Parallelism in Python Compound AI Applications

Stephen Mell; David Mell; Konstantinos Kallas; Steve Zdancewic; Osbert Bastani

arXiv:2605.18697 · cs.DC · May 19, 2026

TL;DR

PopPy is a system that automatically finds and exploits parallelism in Python-based compound AI applications, reducing their execution time by up to 6.4x without altering program semantics.

Contribution

PopPy combines an ahead-of-time compiler with a runtime to uncover parallelism in Python applications that invoke heavy external components such as ML models.

Findings

01 Achieves up to 6.4x speedup in real-world AI applications

02 Supports a broad subset of Python with minimal developer effort

03 Preserves sequential semantics while optimizing performance

Abstract

Compound AI applications, which compose calls to ML models using a general-purpose programming language like Python, are widely used for a variety of user-facing tasks, from software engineering to enterprise automation, making their end-to-end latency a critical bottleneck. In contrast to traditional applications, execution time is dominated by the external components, which cannot be handled by traditional language optimization systems, like optimizing compilers. To address this problem, we develop PopPy, a system that can uncover parallelization opportunities in Python applications that invoke these heavy external components, including those used in compound AI applications. PopPy supports a very expressive fragment of Python and requires minimal developer input to uncover parallelism. It combines an ahead-of-time compiler with a runtime, addressing three key challenges in…
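
To make the kind of opportunity described above concrete, here is a minimal hand-written sketch. It does not use PopPy or reflect its API; `call_model`, `pipeline_sequential`, and `pipeline_parallel` are hypothetical names, and the external component is simulated with a sleep. The sketch only shows that two independent calls to a slow external component can overlap while the program still returns the same result as the sequential version.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def call_model(prompt: str, latency: float = 0.05) -> str:
    """Hypothetical stand-in for a slow external component (e.g. an ML model call)."""
    time.sleep(latency)  # simulated network/model latency
    return f"answer({prompt})"


def pipeline_sequential(question: str) -> str:
    # Two calls that do not depend on each other, then one that needs both.
    a = call_model(f"summarize: {question}")
    b = call_model(f"critique: {question}")
    return call_model(f"combine: {a} | {b}")


def pipeline_parallel(question: str) -> str:
    # Same data dependencies, but the two independent calls run concurrently.
    with ThreadPoolExecutor(max_workers=2) as pool:
        fut_a = pool.submit(call_model, f"summarize: {question}")
        fut_b = pool.submit(call_model, f"critique: {question}")
        a, b = fut_a.result(), fut_b.result()
    # The final call depends on both results, so it must wait for them.
    return call_model(f"combine: {a} | {b}")


def timed(fn, arg):
    """Run fn(arg) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(arg)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    seq_result, seq_time = timed(pipeline_sequential, "q")
    par_result, par_time = timed(pipeline_parallel, "q")
    print("same result:", seq_result == par_result)
    print(f"sequential {seq_time:.2f}s, parallel {par_time:.2f}s")
```

In this sketch the rewrite is done by hand and is only safe because the two calls share no state. The abstract describes a system that aims to find such opportunities without this manual restructuring.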
