MambaFormer: Token-Level Guided Routing Mixture-of-Experts for Accurate and Efficient Clinical Assistance
Hamad Khan, Saddam Hussain Khan (Artificial Intelligence Lab, Department of Computer Systems Engineering, University of Engineering and Applied Sciences (UEAS), Swat 19060, Pakistan)

TL;DR
MambaFormer is a hybrid Mixture-of-Experts framework that dynamically routes tokens to specialized experts for efficient and accurate clinical question-answering, balancing computational cost and prediction performance.
Contribution
It introduces token-level guided routing with a utility-driven multi-objective loss for scalable, resource-efficient clinical language models.
Findings
Outperforms state-of-the-art methods on DentalQA and PubMedQA datasets.
Achieves 24.4x speedup over T5-Large with high accuracy.
Enforces a Pareto-optimal trade-off between latency and accuracy.
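To make the Pareto-optimality finding concrete, here is a minimal sketch of how a latency/accuracy Pareto front can be identified among candidate configurations. The configuration names and numbers are hypothetical placeholders chosen for illustration, not results from the paper, and the paper's own procedure for enforcing the trade-off may differ.

```python
def pareto_front(configs):
    """Return names of configurations not dominated by any other.

    Each config is (name, latency, accuracy). One config dominates another
    if it is no slower AND no less accurate, and strictly better in at
    least one of the two.
    """
    front = []
    for name, latency, accuracy in configs:
        dominated = any(
            (other_lat <= latency and other_acc >= accuracy)
            and (other_lat < latency or other_acc > accuracy)
            for _, other_lat, other_acc in configs
        )
        if not dominated:
            front.append(name)
    return front


# Hypothetical (latency in ms, accuracy) pairs; NOT values from the paper.
configs = [
    ("A", 10.0, 0.80),
    ("B", 20.0, 0.90),
    ("C", 25.0, 0.85),  # slower and less accurate than B
    ("D", 5.0, 0.70),
]
front = pareto_front(configs)
print(front)
```

A configuration on the front cannot be made faster without losing accuracy, or more accurate without adding latency, which is the sense in which a trade-off is called Pareto-optimal.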
Abstract
The deployment of large language models (LLMs) in real-world clinical applications is constrained by the fundamental trade-off between computational cost and the efficiency of linear-time models. To address this, we propose an LLM-based MambaFormer hybrid Mixture-of-Experts (MoE) framework for efficient medical question-answering (QA) and clinical assistance. The MambaFormer employs a lightweight gating mechanism that performs token-level dynamic routing to a customized Transformer expert (ET5) for short, complex queries or to a State Space Model expert (EMamba) for long, high-throughput sequences. The customized EMamba and ET5 models are tailored to accommodate input sequence dimensionality, embedding structure, sequence length, and target-specific output heads, and are fine-tuned through transfer learning on a new, custom-designed DentalQA dataset. Moreover, intelligent routing…
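The token-level routing described in the abstract can be sketched roughly as below. This is an illustrative sketch only: it assumes a simple linear gate followed by a softmax and hard top-1 selection, while the paper's actual gating features, training objective, and routing rule are not given in this summary. The function names, the gate weights, and the toy embeddings are all hypothetical; only the expert names ET5 and EMamba come from the abstract.

```python
import math
import random

# Expert names taken from the abstract; everything else here is assumed.
EXPERTS = ("ET5", "EMamba")


def gate_probabilities(token_embedding, gate_weights):
    """Softmax over one logit per expert (hypothetical linear gate)."""
    logits = [
        sum(w * x for w, x in zip(row, token_embedding))
        for row in gate_weights
    ]
    max_logit = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(logit - max_logit) for logit in logits]
    total = sum(exps)
    return [e / total for e in exps]


def route_tokens(embeddings, gate_weights):
    """Hard top-1 routing: pick the highest-probability expert per token."""
    routes = []
    for embedding in embeddings:
        probs = gate_probabilities(embedding, gate_weights)
        routes.append(EXPERTS[probs.index(max(probs))])
    return routes


# Toy usage with random, untrained weights (illustrative only).
random.seed(0)
dim = 4
gate_weights = [[random.uniform(-1, 1) for _ in range(dim)] for _ in EXPERTS]
tokens = [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(6)]
routes = route_tokens(tokens, gate_weights)
print(routes)
```

In a trained system the gate weights would be learned so that, as the abstract describes, short and complex queries tend toward ET5 and long, high-throughput sequences toward EMamba; the random weights above only demonstrate the mechanics.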
Taxonomy
Topics: Machine Learning in Healthcare · Topic Modeling · Artificial Intelligence in Healthcare and Education
