MetaML-Pro: Cross-Stage Design Flow Automation for Efficient Deep Learning Acceleration

Zhiqiang Que; Jose G. F. Coutinho; Ce Guo; Hongxiang Fan; Wayne Luk

arXiv:2502.05850 · cs.AR · February 11, 2026

TL;DR

MetaML-Pro is a framework that automates the optimization and deployment of deep neural networks on resource-limited hardware like FPGAs, significantly reducing manual effort and design time while maintaining high accuracy.

Contribution

MetaML-Pro introduces a unified, automated approach that combines programmatic DNN optimization with high-level synthesis (HLS) and design space exploration strategies such as Bayesian optimization for efficient hardware deployment.

Findings

01. Up to 92% DSP reduction while maintaining accuracy

02. Up to 89% LUT reduction with preserved accuracy

03. 15.6-fold faster optimization compared to grid search

Abstract

This paper presents a unified framework for codifying and automating optimization strategies to efficiently deploy deep neural networks (DNNs) on resource-constrained hardware, such as FPGAs, while maintaining high performance, accuracy, and resource efficiency. Deploying DNNs on such platforms requires balancing performance, resource usage (e.g., DSPs and LUTs), and inference accuracy, which often demands extensive manual effort and domain expertise. Our novel approach addresses two key issues: (i) encoding custom optimization strategies and (ii) enabling cross-stage optimization search. In particular, our proposed framework integrates programmatic DNN optimization techniques with high-level synthesis (HLS)-based metaprogramming, leveraging advanced design space exploration (DSE) strategies like Bayesian optimization to automate…
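
To make the idea of a cross-stage search space more concrete, the following is a minimal sketch of the exhaustive grid search that the findings use as a point of comparison. It is not the paper's method and not Bayesian optimization. The two knobs (`PRUNING_RATES` at the DNN stage, `BIT_WIDTHS` at the HLS stage), the cost model in `evaluate`, and the 0.02 accuracy tolerance are all hypothetical values chosen for illustration.

```python
from itertools import product

# Hypothetical knobs, one per stage of the design flow.
PRUNING_RATES = [0.0, 0.25, 0.5, 0.75, 0.9]  # DNN stage: fraction of weights removed
BIT_WIDTHS = [6, 8, 12, 16]                  # HLS stage: fixed-point word length


def evaluate(pruning_rate, bit_width):
    """Toy stand-in for training plus synthesis; returns (accuracy, dsp_count).

    The formulas are invented for illustration and model no real device.
    """
    accuracy = 0.76 - 0.05 * pruning_rate ** 2 - 0.004 * max(0, 10 - bit_width)
    dsp_count = round(1000 * (1 - pruning_rate) * bit_width / 16)
    return accuracy, dsp_count


def grid_search(min_accuracy):
    """Evaluate every configuration; keep the feasible one with the fewest DSPs."""
    best_config, best_dsp, n_evals = None, None, 0
    for config in product(PRUNING_RATES, BIT_WIDTHS):
        accuracy, dsp_count = evaluate(*config)
        n_evals += 1
        if accuracy >= min_accuracy and (best_dsp is None or dsp_count < best_dsp):
            best_config, best_dsp = config, dsp_count
    return best_config, best_dsp, n_evals


# Allow at most a 0.02 absolute accuracy drop from the unoptimized design.
baseline_accuracy, baseline_dsp = evaluate(0.0, 16)
best_config, best_dsp, n_evals = grid_search(baseline_accuracy - 0.02)
print(best_config, best_dsp, n_evals)
```

In a real flow, each call to `evaluate` would involve retraining and an HLS run, so the evaluation count grows with the product of all knob ranges. That cost is the motivation for replacing exhaustive search with a sample-efficient strategy such as the Bayesian optimization mentioned in the abstract.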
