POLARON: Precision-aware On-device Learning and Adaptive Runtime-cONfigurable AI acceleration

Mukul Lokhande and Santosh Kumar Vishvakarma

arXiv:2506.08785·cs.AR·June 11, 2025

Open Access

TL;DR

POLARON introduces PARV-CE, a flexible, multi-precision AI accelerator that adapts precision dynamically to optimize energy efficiency and performance for diverse edge AI workloads.

Contribution

This work presents PARV-CE, a novel SIMD-enabled multi-precision MAC engine with adaptive precision strategies, enabling efficient on-device AI training and inference across various models.

Findings

01. Up to 2x reduction in power-delay product (PDP) compared to state-of-the-art designs

02. 3x reduction in resource usage while keeping accuracy within 1.8% of the FP32 baseline

03. Supports diverse workloads, including DNNs, RNNs, RL, and Transformers

Abstract

The increasing complexity of AI models requires flexible hardware that supports diverse precision formats, particularly on energy-constrained edge platforms. This work presents PARV-CE, a SIMD-enabled, multi-precision MAC engine that performs efficient multiply-accumulate operations on a unified datapath for 4/8/16-bit fixed-point, floating-point, and posit formats. The architecture incorporates a layer-adaptive precision strategy that aligns computational accuracy with workload sensitivity, optimizing both performance and energy usage. PARV-CE integrates quantization-aware execution with a reconfigurable SIMD pipeline, enabling high-throughput processing with minimal overhead through hardware-software co-design. The results demonstrate up to a 2x improvement in power-delay product (PDP) and a 3x reduction in resource usage compared to state-of-the-art (SoTA) designs, while retaining accuracy within 1.8% of the FP32 baseline.…
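To make the idea of a SIMD multi-precision MAC concrete, here is a minimal software sketch, assuming a 16-bit operand word that is treated as four 4-bit lanes, two 8-bit lanes, or one 16-bit lane of signed fixed-point values. The function names (`pack_lanes`, `unpack_lanes`, `simd_mac`) and the 16-bit word width are illustrative assumptions, not taken from the paper, and the sketch covers only the fixed-point modes; it does not model the floating-point or posit paths or the actual PARV-CE datapath.

```python
from typing import List


def pack_lanes(values: List[int], lane_bits: int) -> int:
    """Pack signed integers into one word, lowest lane first (two's complement)."""
    mask = (1 << lane_bits) - 1
    word = 0
    for i, v in enumerate(values):
        word |= (v & mask) << (i * lane_bits)
    return word


def unpack_lanes(word: int, lane_bits: int, word_bits: int = 16) -> List[int]:
    """Split a packed word into signed lanes, lowest lane first."""
    mask = (1 << lane_bits) - 1
    sign_bit = 1 << (lane_bits - 1)
    lanes = []
    for i in range(word_bits // lane_bits):
        raw = (word >> (i * lane_bits)) & mask
        # Reinterpret the raw field as a two's-complement signed value.
        lanes.append(raw - (1 << lane_bits) if raw & sign_bit else raw)
    return lanes


def simd_mac(acc: List[int], a_word: int, b_word: int,
             lane_bits: int, word_bits: int = 16) -> List[int]:
    """Lane-wise multiply-accumulate: acc[i] + a[i] * b[i] for every lane.

    Hypothetical helper: the accumulators are unbounded Python ints,
    so overflow and saturation behaviour of real hardware is not modelled.
    """
    a = unpack_lanes(a_word, lane_bits, word_bits)
    b = unpack_lanes(b_word, lane_bits, word_bits)
    return [c + x * y for c, x, y in zip(acc, a, b)]


# Same 16-bit word width, three lane configurations.
r4 = simd_mac([0] * 4, pack_lanes([1, -2, 3, -4], 4), pack_lanes([2, 2, 2, 2], 4), 4)
r8 = simd_mac([0] * 2, pack_lanes([100, -100], 8), pack_lanes([2, 3], 8), 8)
r16 = simd_mac([0], pack_lanes([-1234], 16), pack_lanes([10], 16), 16)
```

In this sketch the lane width is a runtime argument, which loosely mirrors the idea of choosing precision per layer: a narrower lane processes more operands per word at lower precision, while a wider lane processes fewer operands at higher precision.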

Taxonomy

Topics: Embedded Systems Design Techniques · Parallel Computing and Optimization Techniques · Advanced Neural Network Applications

Methods: Absolute Position Encodings · Layer Normalization · Byte Pair Encoding · Label Smoothing · Softmax · Dropout · Dense Connections · Transformer · ALIGN