Mixture of Raytraced Experts

Andrea Perin; Giacomo Lagomarsini; Claudio Gallicchio; Giuseppe Nuti

arXiv:2507.12419·cs.LG·July 17, 2025

Open Access

TL;DR

This paper introduces the Mixture of Raytraced Experts, a stacked Mixture of Experts (MoE) architecture that dynamically selects sequences of experts. In preliminary experiments it reaches comparable or higher accuracy in 10% to 40% fewer training epochs, without requiring load-balancing mechanisms.

Contribution

It presents a stacked MoE architecture that selects sequences of experts dynamically, producing computational graphs of variable width and depth and reducing training epochs without load-balancing mechanisms.

Findings

01. Training epochs reduced by 10% to 40% in preliminary experiments

02. Comparable or higher accuracy

03. Computational graphs of variable width and depth

Abstract

We introduce a Mixture of Raytraced Experts, a stacked Mixture of Experts (MoE) architecture which can dynamically select sequences of experts, producing computational graphs of variable width and depth. Existing MoE architectures generally require a fixed amount of computation for a given sample. Our approach, in contrast, yields predictions with increasing accuracy as the computation cycles through the experts' sequence. We train our model by iteratively sampling from a set of candidate experts, unfolding the sequence akin to how Recurrent Neural Networks are trained. Our method does not require load-balancing mechanisms, and preliminary experiments show a reduction in training epochs of 10% to 40% with a comparable/higher accuracy. These results point to new research directions in the field of MoEs, allowing the design of potentially faster and more expressive models. The code is…
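The abstract describes iteratively sampling from a set of candidate experts and producing a prediction that can be read out after each step. The toy sketch below illustrates that general idea only; it is not the authors' implementation. The function `route_sample`, the scalar gate weights, the tanh "experts", and the running-average readout are all hypothetical choices made for illustration.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_sample(x, gate_weights, expert_scales, max_steps, rng):
    """Sample a sequence of experts one step at a time (illustrative assumption).

    At each step a gate scores the remaining candidate experts from the
    current state, one expert is sampled, and an intermediate prediction
    is recorded, so computation can stop after any step.
    """
    candidates = list(range(len(gate_weights)))
    sequence, predictions = [], []
    state, running = x, 0.0
    for _ in range(max_steps):
        if not candidates:
            break
        # Hypothetical gate: a scalar weight per expert times the current state.
        probs = softmax([gate_weights[i] * state for i in candidates])
        chosen = rng.choices(candidates, weights=probs, k=1)[0]
        candidates.remove(chosen)
        # Hypothetical expert: a scaled tanh applied to the current state.
        state = math.tanh(expert_scales[chosen] * state)
        running += state
        sequence.append(chosen)
        # Hypothetical readout: running average of the expert outputs so far.
        predictions.append(running / len(sequence))
    return sequence, predictions


rng = random.Random(0)
seq, preds = route_sample(
    0.5, [0.1, -0.3, 0.8, 0.2], [1.0, 0.5, 2.0, -1.0], max_steps=3, rng=rng
)
print(seq, preds)
```

Here the length of `seq` depends on `max_steps` and on how many candidates remain, which is one simple way to obtain a variable-depth computation; the paper's actual selection rule and training procedure may differ.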

Taxonomy

Topics: Conferences and Exhibitions Management