ExecuTorch -- A Unified PyTorch Solution to Run AI Models On-Device

Mergen Nachin; Digant Desai; Sicheng Stephen Jia; Chen Lai; Mengwei Liu; Jacob Szwejbka; Raziel Alvarez; RJ Ascani; Dave Bort; Manuel Candales; Andrew Caples; Yanan Cao; Zhengxu Chen; Soumith Chintala; Gregory Comer; Tanvir Islam; Songhao Jia; Tarun Karuturi; Jack Khuu; Abhinay Kukkadapu; Tugsbayasgalan Manlaibaatar; Andrew Or; Kimish Patel; Siddartha Pothapragada; Lucy Qiu; Supriya Rao; Orion Reblitz-Richardson; Max Ren; Scott Roy; Anthony Shoumikhin; Scott Wolchok; Guang Yang; Angela Yi; Martin Yuan; Hansong Zhang; Jack Zhang; Jerry Zhang; Shunting Zhang; and C. Cagatay Bilgin

arXiv:2605.08195·cs.LG·May 12, 2026

ExecuTorch -- A Unified PyTorch Solution to Run AI Models On-Device

Mergen Nachin, Digant Desai, Sicheng Stephen Jia, Chen Lai, Mengwei Liu, Jacob Szwejbka, Raziel Alvarez, RJ Ascani, Dave Bort, Manuel Candales, Andrew Caples, Yanan Cao, Zhengxu Chen, Soumith Chintala, Gregory Comer, Tanvir Islam, Songhao Jia, Tarun Karuturi, Jack Khuu

PDF

TL;DR

ExecuTorch is a unified, PyTorch-native framework that simplifies deploying AI models across diverse edge hardware, supporting customization, optimization, and seamless experimentation within PyTorch.

Contribution

It introduces a comprehensive deployment solution that maintains PyTorch semantics and supports heterogeneous hardware, bridging research and production workflows.

Findings

01

Supports deployment from microcontrollers to large SoCs.

02

Enables validation of deployment behavior within PyTorch.

03

Supports optimizations like quantization and customizable backends.

Abstract

Local execution of AI on edge devices is important for low latency and offline operation. However, deploying models on diverse hardware remains fragmented, often requiring model conversion or complete reimplementation outside the PyTorch ecosystem where the model was originally authored. We introduce ExecuTorch, a unified PyTorch-native deployment framework for edge AI. ExecuTorch enables seamless deployment of machine learning models across heterogeneous compute environments. It scales from embedded microcontrollers to complex system-on-chips (SoCs) with dedicated accelerators, powering devices ranging from wearables and smartphones to large compute clusters. ExecuTorch preserves PyTorch semantics while allowing customization, support for optimizations like quantization, and pluggable execution "backends". These features together enable fast experimentation, allowing researchers to…

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.