FourierMoE: Fourier Mixture-of-Experts Adaptation of Large Language Models

Juyong Jiang; Fan Wang; Hong Qi; Sunghun Kim; Jing Tang

arXiv:2604.01762·cs.LG·April 3, 2026

TL;DR

FourierMoE introduces a spectral-domain mixture-of-experts (MoE) approach to fine-tuning large language models, improving task performance and parameter efficiency through frequency-aware adaptation.

Contribution

The paper reformulates MoE adaptation in the spectral domain using the inverse discrete Fourier transform (IDFT), enabling frequency-specific expert routing and lossless real-valued weight reconstruction.
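
To make the "lossless real-valued reconstruction" point concrete, here is a minimal NumPy sketch. It is not the authors' implementation. It only illustrates a standard Fourier property: a spectrum with conjugate (Hermitian) symmetry maps through the 2-D IDFT to a purely real matrix, so no information is lost by discarding an imaginary part. The helper `hermitian_symmetrize`, the matrix sizes, and the random sparse coefficients are all assumptions chosen for illustration.

```python
import numpy as np

def hermitian_symmetrize(spec):
    """Return the conjugate-symmetric part of a 2-D spectrum:
    (F[k, l] + conj(F[-k, -l])) / 2, with indices taken modulo the shape."""
    # flip followed by a roll of 1 maps index i to (-i) mod n on each axis
    reversed_idx = np.roll(np.flip(spec, axis=(0, 1)), shift=(1, 1), axis=(0, 1))
    return 0.5 * (spec + np.conj(reversed_idx))

rng = np.random.default_rng(0)
d_out, d_in, n_coef = 8, 6, 5  # hypothetical sizes, not from the paper

# A sparse set of trainable complex coefficients at random frequency positions.
spec = np.zeros((d_out, d_in), dtype=complex)
rows = rng.integers(0, d_out, n_coef)
cols = rng.integers(0, d_in, n_coef)
spec[rows, cols] = rng.normal(size=n_coef) + 1j * rng.normal(size=n_coef)

sym = hermitian_symmetrize(spec)
delta_w = np.fft.ifft2(sym)                       # spatial-domain weight update
max_imag = float(np.abs(delta_w.imag).max())      # should be at rounding-error level
delta_w_real = delta_w.real

# Round trip: the real matrix carries exactly the symmetrized spectrum.
round_trip_ok = np.allclose(np.fft.fft2(delta_w_real), sym)
```

In a setup like this, the real matrix `delta_w_real` would be added to a frozen weight matrix. How FourierMoE itself parameterizes and routes the coefficients is described in the paper and is not reproduced here.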

Findings

01. Outperforms baselines across 28 benchmarks at various scales.

02. Achieves better multi-task and single-task performance with fewer parameters.

03. Spectral-domain adaptation reveals task-specific frequency energy distributions.
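
One way to inspect a frequency energy distribution of this kind is sketched below. This is an assumption about how such an analysis could be done, not the procedure used in the paper: it takes a weight-update matrix, computes its 2-D DFT power spectrum, and reports the share of energy falling in each radial frequency band. The function name `band_energy`, the band count, and the two toy matrices are hypothetical.

```python
import numpy as np

def band_energy(delta_w, n_bands=4):
    """Fraction of spectral energy in each radial frequency band (low to high)."""
    power = np.abs(np.fft.fft2(delta_w)) ** 2
    # Normalized frequencies in cycles per sample, range [-0.5, 0.5)
    fy = np.fft.fftfreq(delta_w.shape[0])
    fx = np.fft.fftfreq(delta_w.shape[1])
    radius = np.sqrt(fy[:, None] ** 2 + fx[None, :] ** 2)
    edges = np.linspace(0.0, radius.max(), n_bands + 1)
    # digitize returns 1-based bin indices; clip so the maximum radius lands in the last band
    idx = np.clip(np.digitize(radius, edges) - 1, 0, n_bands - 1)
    energy = np.bincount(idx.ravel(), weights=power.ravel(), minlength=n_bands)
    return energy / energy.sum()

# Toy inputs: a constant matrix (pure DC) and a checkerboard (highest frequency).
smooth = np.ones((8, 8))
i, j = np.indices((8, 8))
checker = (-1.0) ** (i + j)

low_profile = band_energy(smooth)
high_profile = band_energy(checker)
```

Comparing such profiles across updates learned for different tasks or layers is one plausible way to see the heterogeneity the finding refers to.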

Abstract

Parameter-efficient fine-tuning (PEFT) has emerged as a crucial paradigm for adapting large language models (LLMs) under constrained computational budgets. However, standard PEFT methods often struggle in multi-task fine-tuning settings, where diverse optimization objectives induce task interference and limited parameter budgets lead to representational deficiency. While recent approaches incorporate mixture-of-experts (MoE) to alleviate these issues, they predominantly operate in the spatial domain, which may introduce structural redundancy and parameter overhead. To overcome these limitations, we reformulate adaptation in the spectral domain. Our spectral analysis reveals that different tasks exhibit distinct frequency energy distributions, and that LLM layers display heterogeneous frequency sensitivities. Motivated by these insights, we propose FourierMoE, which integrates the MoE…
