A Parameterizable Convolution Accelerator for Embedded Deep Learning Applications

Panagiotis Mousouliotis; Georgios Keramidas

arXiv:2602.04044 · cs.CV · February 5, 2026

Open Access

TL;DR

This paper introduces a parameterizable CNN accelerator for embedded FPGA applications, optimized through HW/SW co-design and high-level synthesis to balance performance, latency, power, and area constraints.

Contribution

The paper presents a flexible, HLS-based design methodology for FPGA CNN accelerators that addresses multiple embedded application constraints, including latency, power consumption, area, and cost.

Findings

01. Outperforms non-parameterized design approaches.

02. Can be easily extended to other types of deep learning applications.

03. Demonstrates effective optimization across multiple design constraints.

Abstract

Convolutional neural network (CNN) accelerators implemented on Field-Programmable Gate Arrays (FPGAs) are typically designed with a primary focus on maximizing performance, often measured in giga-operations per second (GOPS). However, real-life embedded deep learning (DL) applications impose multiple constraints related to latency, power consumption, area, and cost. This work presents a hardware-software (HW/SW) co-design methodology in which a CNN accelerator is described using high-level synthesis (HLS) tools that ease the parameterization of the design, facilitating more effective optimizations across multiple design constraints. Our experimental results demonstrate that the proposed design methodology is able to outperform non-parameterized design approaches, and it can be easily extended to other types of DL applications.
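
To make the idea of HLS-eased parameterization concrete, the following is a minimal sketch of a convolution layer whose data types and dimensions are compile-time template parameters, written in plain C++ so it compiles with an ordinary compiler. It is not the authors' design: the names `ConvLayer`, `macs`, and `gops`, the stride-1 unpadded convolution, and the convention of counting two operations per multiply-accumulate are all assumptions made for illustration. In an actual HLS flow the loops would additionally carry tool-specific directives (for example for unrolling or pipelining), which are omitted here.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical parameter set (illustrative names, not taken from the paper):
// DataT/AccT select the arithmetic precision, the remaining values fix the
// layer geometry at compile time so each instantiation is a distinct design.
template <typename DataT, typename AccT,
          std::size_t IN_CH, std::size_t OUT_CH,
          std::size_t H, std::size_t W, std::size_t K>
struct ConvLayer {
    static_assert(K >= 1 && K <= H && K <= W, "kernel must fit inside the input");

    // Assumption: stride 1, no padding.
    static constexpr std::size_t OUT_H = H - K + 1;
    static constexpr std::size_t OUT_W = W - K + 1;

    using Input   = std::array<DataT, IN_CH * H * W>;
    using Weights = std::array<DataT, OUT_CH * IN_CH * K * K>;
    using Output  = std::array<AccT, OUT_CH * OUT_H * OUT_W>;

    // Number of multiply-accumulate operations for one forward pass.
    static constexpr std::uint64_t macs() {
        return static_cast<std::uint64_t>(OUT_CH) * OUT_H * OUT_W * IN_CH * K * K;
    }

    // Direct convolution over channel-major (C, H, W) buffers.
    static void run(const Input& in, const Weights& w, Output& out) {
        for (std::size_t oc = 0; oc < OUT_CH; ++oc) {
            for (std::size_t oy = 0; oy < OUT_H; ++oy) {
                for (std::size_t ox = 0; ox < OUT_W; ++ox) {
                    AccT acc = 0;
                    for (std::size_t ic = 0; ic < IN_CH; ++ic) {
                        for (std::size_t ky = 0; ky < K; ++ky) {
                            for (std::size_t kx = 0; kx < K; ++kx) {
                                const std::size_t i = (ic * H + oy + ky) * W + ox + kx;
                                const std::size_t j = ((oc * IN_CH + ic) * K + ky) * K + kx;
                                acc += static_cast<AccT>(in[i]) * static_cast<AccT>(w[j]);
                            }
                        }
                    }
                    out[(oc * OUT_H + oy) * OUT_W + ox] = acc;
                }
            }
        }
    }
};

// Throughput in giga-operations per second, assuming 2 operations per MAC
// (one multiply, one add). The measured time must be positive.
inline double gops(std::uint64_t mac_count, double seconds) {
    assert(seconds > 0.0);
    return 2.0 * static_cast<double>(mac_count) / seconds / 1e9;
}
```

Changing a single template argument (for instance `std::int8_t` versus `std::int16_t` for `DataT`, or a different `OUT_CH`) yields a different instantiation, which is the kind of design-space knob that a parameterized description exposes when trading performance against latency, power, and area.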

Taxonomy

Topics: Advanced Neural Network Applications · Embedded Systems Design Techniques · Numerical Methods and Algorithms