AxMoE: Characterizing the Impact of Approximate Multipliers on Mixture-of-Experts DNN Architectures

Omkar B Shende; Marcello Traiola; Gayathri Ananthanarayanan

arXiv:2605.04754·cs.LG·May 7, 2026

TL;DR

This study investigates how approximate multipliers affect the performance of various Mixture-of-Experts DNN architectures, revealing architecture-dependent resilience and recovery capabilities after retraining.

Contribution

The first comprehensive analysis of how approximate multiplication affects MoE DNNs, an interaction that no prior work has examined. It highlights architecture-specific behavior and the potential to recover accuracy through retraining.

Findings

01

Dense baseline is most resilient without retraining.

02

VGG architectures recover after retraining with moderately approximate multipliers, but not with aggressive ones.

03

Hard MoE outperforms Dense on ViT-Small under aggressive approximation after retraining.

Abstract

Deep neural network (DNN) inference at the edge demands simultaneous improvements in accuracy, computational efficiency, and energy consumption. Approximate computing and Mixture-of-Experts (MoE) architectures have each been studied as independent routes towards efficient inference: the former by replacing exact arithmetic with low-power approximate multipliers, the latter by routing inputs through specialized expert sub-networks to enable conditional computation. However, their interaction remains entirely unexplored. This paper presents AxMoE, the first study of the impact of approximate multiplication on MoE DNN architectures. We evaluate three MoE variants (Hard MoE, Soft MoE, and Cluster MoE) against dense baselines across three CNN architectures (ResNet-20, VGG11_bn, VGG19_bn) on CIFAR-100 and a Vision Transformer (ViT-Small) on the Tiny ImageNet-200 dataset, using eight 8-bit signed…
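
To make the two ideas in the abstract concrete, the following is a minimal sketch of an approximate 8-bit signed multiply feeding a top-1 ("hard") expert router. Everything in it is an illustrative assumption: `approx_mul8` is a simple operand-truncation scheme and is not one of the multipliers evaluated in the paper, and `hard_moe`, its gate, and its experts are hypothetical single-layer stand-ins rather than the authors' architectures.

```python
from typing import Callable, List, Sequence, Tuple

Mul = Callable[[int, int], int]


def exact_mul(a: int, b: int) -> int:
    """Reference exact multiply."""
    return a * b


def approx_mul8(a: int, b: int, drop_bits: int = 2) -> int:
    """Hypothetical approximate multiply for signed 8-bit operands.

    Clears the lowest `drop_bits` bits of each operand before multiplying.
    This is an illustrative stand-in, not a multiplier from the paper.
    """
    if not (-128 <= a <= 127 and -128 <= b <= 127):
        raise ValueError("operands must fit in signed 8 bits")
    mask = ~((1 << drop_bits) - 1)  # e.g. drop_bits=2 gives ...11111100
    return (a & mask) * (b & mask)


def dot(x: Sequence[int], w: Sequence[int], mul: Mul) -> int:
    """Dot product whose multiplies go through the supplied multiplier."""
    return sum(mul(xi, wi) for xi, wi in zip(x, w))


def hard_moe(
    x: Sequence[int],
    gate_weights: List[Sequence[int]],
    expert_weights: List[Sequence[int]],
    mul: Mul,
) -> Tuple[int, int]:
    """Top-1 routing: score every expert, then evaluate only the winner."""
    scores = [dot(x, g, mul) for g in gate_weights]
    k = max(range(len(scores)), key=scores.__getitem__)
    return k, dot(x, expert_weights[k], mul)


# Toy int8 input, gate, and experts (all values are made up).
x = [10, -7, 3, 100]
gate = [[64, 0, 0, 0], [0, 0, 0, 64]]
experts = [[5, 5, 5, 5], [20, -30, 40, 10]]

exact_result = hard_moe(x, gate, experts, exact_mul)
approx_result = hard_moe(x, gate, experts, approx_mul8)
print("exact :", exact_result)
print("approx:", approx_result)
```

In this toy setup the multiplier error reaches both the gate scores and the selected expert's output, so an approximate multiplier can in principle change which expert is chosen as well as what that expert computes.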
