Filtered not Mixed: Stochastic Filtering-Based Online Gating for Mixture of Large Language Models
Raeid Saqur, Anastasis Kratsios, Florian Krach, Yannick Limmer, Jacob-Junqi Tian, John Willes, Blanka Horvath, Frank Rudzicz

TL;DR
This paper introduces MoE-F, a stochastic filtering-based online gating mechanism for combining multiple pre-trained LLMs in time-series prediction, achieving significant performance improvements over individual models.
Contribution
It presents a novel filtering-based ensemble method for adaptive expert combination, with theoretical guarantees and empirical validation on financial and long-horizon forecasting tasks.
Findings
Achieves 17% absolute F1 improvement in financial market prediction
Demonstrates substantial gains in long-horizon time-series forecasting
Provides theoretical optimality guarantees for the filtering-based gating algorithm
Abstract
We propose MoE-F - a formalized mechanism for combining pre-trained Large Language Models (LLMs) for online time-series prediction by adaptively forecasting the best weighting of LLM predictions at every time step. Our mechanism leverages the conditional information in each expert's running performance to forecast the best combination of LLMs for predicting the time series in its next step. Diverging from static (learned) Mixture of Experts (MoE) methods, our approach employs time-adaptive stochastic filtering techniques to combine experts. By framing the expert selection problem as a finite state-space, continuous-time Hidden Markov model (HMM), we can leverage the Wonham-Shiryaev filter. Our approach first constructs N parallel filters corresponding to each of the individual LLMs. Each filter proposes its best combination of LLMs, given the information that they have access…
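To make the filtering idea concrete, here is a minimal sketch of online gating with a discrete-time HMM filter. It is not the paper's algorithm: MoE-F uses the continuous-time Wonham-Shiryaev filter and runs N parallel filters, whereas this sketch runs one discrete predict/correct recursion over the hidden state "which expert is currently best". The names `make_transition`, `filter_step`, and `combine`, the `stay` and `beta` parameters, the exponential pseudo-likelihood, and the loss values are all assumptions made for illustration.

```python
import math


def make_transition(n, stay=0.9):
    """Row-stochastic matrix for n >= 2 experts (assumed form): the best
    expert stays the same with probability `stay`, otherwise it switches
    uniformly to one of the other experts."""
    off = (1.0 - stay) / (n - 1)
    return [[stay if i == j else off for j in range(n)] for i in range(n)]


def filter_step(weights, losses, transition, beta=1.0):
    """One predict/correct step of a discrete-time HMM filter.

    weights: current belief over which expert is best (sums to 1)
    losses:  each expert's observed loss on the latest time step
    beta:    assumed sharpness of the pseudo-likelihood exp(-beta * loss)
    """
    n = len(weights)
    # Predict: push the belief through the Markov transition.
    predicted = [
        sum(weights[i] * transition[i][j] for i in range(n)) for j in range(n)
    ]
    # Correct: reweight by the pseudo-likelihood; subtracting the minimum
    # loss keeps the exponentials numerically stable.
    min_loss = min(losses)
    unnorm = [
        p * math.exp(-beta * (loss - min_loss))
        for p, loss in zip(predicted, losses)
    ]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def combine(weights, predictions):
    """Weighted mixture of the experts' scalar predictions."""
    return sum(w * p for w, p in zip(weights, predictions))


# Made-up running losses for three experts over three time steps.
weights = [1 / 3, 1 / 3, 1 / 3]
transition = make_transition(3)
loss_history = [[0.1, 0.9, 0.5], [0.2, 0.8, 0.6], [0.1, 1.0, 0.4]]
for losses in loss_history:
    weights = filter_step(weights, losses, transition, beta=2.0)
```

After the loop, `weights` favors the first expert, which had the lowest loss at every step, and `combine(weights, predictions)` would mix the experts' next-step forecasts accordingly. The off-diagonal mass in the transition matrix keeps every weight strictly positive, so the gate can shift to a different expert if the performance ranking changes later.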
Taxonomy
Topics: Topic Modeling · Speech Recognition and Synthesis · Speech and dialogue systems
Methods: Mixture of Experts
