Compilation and Execution of an Embeddable YOLO-NAS on the VTA

Anthony Faure-Gignoux; Kevin Delmas; Adrien Gauffriau; Claire Pagetti

arXiv:2604.24455·cs.AR·April 28, 2026

TL;DR

This paper extends and automates the VTA compiler to enable complete CNN compilation, supporting larger models like YOLO-NAS for FPGA deployment in safety-critical applications.

Contribution

It extends the authors' earlier stand-alone VTA compiler into a fully automated compilation chain that compiles complete CNNs, including networks whose parameters do not fit in on-chip memory.

Findings

01 Successful compilation of YOLO-NAS on VTA

02 Demonstrated simulated execution of the model

03 Enhanced compiler supports larger CNNs

Abstract

Deploying complex Convolutional Neural Networks (CNNs) on FPGA-based accelerators is a promising way forward for safety-critical domains such as aeronautics. In previous work, we explored the Versatile Tensor Accelerator (VTA) and showed its suitability for avionic applications. To that end, we developed an initial stand-alone compiler designed with certification in mind. However, this compiler still suffers from limitations, which this paper overcomes. The contributions consist of extending and fully automating the VTA compilation chain to allow complete CNN compilation and to support larger CNNs (whose parameters do not fit in the on-chip memory). The effectiveness of the approach is demonstrated by the successful compilation and simulated execution of a YOLO-NAS object detection model.
