FourierMoE: Fourier Mixture-of-Experts Adaptation of Large Language Models
Juyong Jiang, Fan Wang, Hong Qi, Sunghun Kim, Jing Tang

TL;DR
FourierMoE introduces a spectral-domain mixture-of-experts approach to fine-tuning large language models, improving task performance and parameter efficiency through frequency-aware adaptation.
Contribution
FourierMoE reformulates MoE adaptation in the spectral domain: experts are routed to specific frequencies, and the inverse discrete Fourier transform (IDFT) maps the result back to real-valued weights without loss.
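As a rough illustration of the idea, the sketch below builds a weight update from per-expert spectral coefficients and recovers a real-valued matrix with a 2D IDFT. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names (hermitian_symmetrize, fourier_moe_delta), the fixed per-expert frequency bands, the softmax gate, and the use of conjugate symmetry to guarantee a real result are all hypothetical choices made for this example.

```python
import numpy as np

def hermitian_symmetrize(spec):
    """Enforce S[u, v] == conj(S[-u, -v]) so that the 2D IDFT is real-valued."""
    # flip + roll by 1 maps index i to (-i) mod N along each axis
    mirrored = np.roll(np.flip(spec, axis=(0, 1)), shift=1, axis=(0, 1))
    return 0.5 * (spec + np.conj(mirrored))

def fourier_moe_delta(expert_coeffs, expert_bands, gates, shape):
    """Mix gated expert coefficients in the spectral domain, then apply the IDFT.

    ASSUMPTION: each expert owns a fixed set of (row, col) frequency indices
    and one complex coefficient per index; the paper's routing may differ.
    """
    spec = np.zeros(shape, dtype=np.complex128)
    for coeffs, (rows, cols), gate in zip(expert_coeffs, expert_bands, gates):
        np.add.at(spec, (rows, cols), gate * coeffs)  # safe with repeated indices
    spec = hermitian_symmetrize(spec)
    delta = np.fft.ifft2(spec)
    # The imaginary part is only floating-point noise after symmetrization.
    return delta.real, float(np.abs(delta.imag).max()), spec

# Toy usage with two hypothetical experts on an 8x6 weight matrix.
rng = np.random.default_rng(0)
shape = (8, 6)
expert_bands = [
    (np.array([0, 1, 1]), np.array([0, 0, 1])),  # low-frequency expert
    (np.array([3, 4, 2]), np.array([2, 3, 1])),  # higher-frequency expert
]
expert_coeffs = [rng.normal(size=3) + 1j * rng.normal(size=3) for _ in expert_bands]
logits = np.array([0.2, -0.1])                   # hypothetical router output
gates = np.exp(logits) / np.exp(logits).sum()    # softmax gate

delta, max_imag, spec = fourier_moe_delta(expert_coeffs, expert_bands, gates, shape)
```

Because the spectrum is made conjugate-symmetric before the inverse transform, taking the real part discards nothing: applying np.fft.fft2 to delta returns the same symmetrized spectrum.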
Findings
Outperforms baselines across 28 benchmarks at various scales.
Achieves better multi-task and single-task performance with fewer parameters.
Spectral domain adaptation reveals task-specific frequency energy distributions.
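One way to inspect a frequency energy distribution of the kind mentioned in the findings is to take the 2D DFT of a weight update and sum its power over radial frequency bands. The sketch below is an assumption-laden illustration rather than the paper's analysis procedure: the function name band_energy, the radial binning, and the number of bands are hypothetical choices.

```python
import numpy as np

def band_energy(delta_w, n_bands=4):
    """Fraction of spectral energy of a 2D matrix in each radial frequency band.

    ASSUMPTION: bands are equal-width rings in normalized frequency; the
    paper may partition the spectrum differently.
    """
    power = np.abs(np.fft.fft2(delta_w)) ** 2
    fy = np.fft.fftfreq(delta_w.shape[0])[:, None]  # cycles per sample, rows
    fx = np.fft.fftfreq(delta_w.shape[1])[None, :]  # cycles per sample, cols
    radius = np.sqrt(fy ** 2 + fx ** 2)
    edges = np.linspace(0.0, radius.max(), n_bands + 1)
    # digitize returns 1-based bin ids; clip keeps the max radius in the last band
    idx = np.clip(np.digitize(radius, edges) - 1, 0, n_bands - 1)
    energy = np.bincount(idx.ravel(), weights=power.ravel(), minlength=n_bands)
    return energy / energy.sum()

# Toy usage: a constant matrix is pure low frequency, a checkerboard pure high.
smooth = np.ones((16, 16))
i, j = np.indices((16, 16))
checker = (-1.0) ** (i + j)

smooth_profile = band_energy(smooth)
checker_profile = band_energy(checker)
```

Comparing such profiles for updates learned on different tasks, or at different layers, gives a simple view of where each one concentrates its energy.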
Abstract
Parameter-efficient fine-tuning (PEFT) has emerged as a crucial paradigm for adapting large language models (LLMs) under constrained computational budgets. However, standard PEFT methods often struggle in multi-task fine-tuning settings, where diverse optimization objectives induce task interference and limited parameter budgets lead to representational deficiency. While recent approaches incorporate mixture-of-experts (MoE) to alleviate these issues, they predominantly operate in the spatial domain, which may introduce structural redundancy and parameter overhead. To overcome these limitations, we reformulate adaptation in the spectral domain. Our spectral analysis reveals that different tasks exhibit distinct frequency energy distributions, and that LLM layers display heterogeneous frequency sensitivities. Motivated by these insights, we propose FourierMoE, which integrates the MoE…
