A Theoretical Framework for Modular Learning of Robust Generative Models

Corinna Cortes; Mehryar Mohri; Yutao Zhong

arXiv:2602.17554·cs.LG·February 25, 2026




TL;DR

This paper introduces a theoretical framework for modular generative models that combine pre-trained experts through a robust gating mechanism. The goals are scalable training, robustness to any data mixture, and, in some settings, better performance than monolithic models.

Contribution

The paper formulates robust gating as a minimax game, proves that a robust gate exists, shows that modularity acts as a regularizer with generalization bounds that scale with the gate's complexity, and proposes scalable algorithms.

Findings

01 Modular models can outperform monolithic models in certain settings.

02 Theoretical guarantees for the existence of robust gating functions.

03 Empirical results show effective mitigation of gradient conflict.

Abstract

Training large-scale generative models is resource-intensive and relies heavily on heuristic dataset weighting. We address two fundamental questions: Can we train Large Language Models (LLMs) modularly, combining small, domain-specific experts to match monolithic performance, and can we do so robustly for any data mixture, eliminating heuristic tuning? We present a theoretical framework for modular generative modeling where a set of pre-trained experts is combined via a gating mechanism. We define the space of normalized gating functions, $G_{1}$, and formulate the problem as a minimax game to find a single robust gate that minimizes divergence to the worst-case data mixture. We prove the existence of such a robust gate using Kakutani's fixed-point theorem and show that modularity acts as a strong regularizer, with generalization bounds scaling with the lightweight gate's complexity.…
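To make the minimax idea in the abstract concrete, here is a toy sketch under strong simplifying assumptions that are not taken from the paper: the experts are fixed discrete distributions with full support, the gate is a single input-independent weight vector on the simplex (the paper's gates are functions in $G_{1}$), and the divergence is the KL divergence. Because KL is convex in its first argument, the worst-case mixture over the experts' domains sits at a vertex of the simplex, so the inner maximum reduces to a maximum over individual experts. The outer minimization uses a plain exponentiated-gradient subgradient step, chosen here for brevity and not claimed to be the authors' algorithm. All names (`EXPERTS`, `worst_case_kl`, `robust_constant_gate`) are hypothetical.

```python
import numpy as np

# Hypothetical toy experts: each row is a distribution over 4 tokens.
# Full support is assumed so that every KL term below is finite.
EXPERTS = np.array([
    [0.85, 0.05, 0.05, 0.05],
    [0.05, 0.85, 0.05, 0.05],
    [0.10, 0.10, 0.40, 0.40],
])


def kl(p, q):
    """KL(p || q) for strictly positive discrete distributions."""
    return float(np.sum(p * np.log(p / q)))


def worst_case_kl(w, experts):
    """Max over experts of KL(p_k || q_w), where q_w = sum_j w_j p_j.

    By convexity of KL in its first argument, this also bounds the
    divergence from any mixture of the experts to q_w.
    """
    q = w @ experts
    return max(kl(p, q) for p in experts)


def robust_constant_gate(experts, steps=300, eta=0.1):
    """Approximate argmin_w max_k KL(p_k || q_w) over the simplex.

    Assumption: an input-independent gate, optimized with an
    exponentiated-gradient subgradient method. The best iterate seen
    so far is returned, starting from the uniform gate.
    """
    k = experts.shape[0]
    w = np.full(k, 1.0 / k)
    best_w, best_val = w.copy(), worst_case_kl(w, experts)
    for _ in range(steps):
        q = w @ experts
        losses = np.array([kl(p, q) for p in experts])
        worst = experts[np.argmax(losses)]   # adversary's best response
        # d/dw_j KL(worst || q_w) = -sum_x worst(x) * p_j(x) / q_w(x)
        grad = -(experts @ (worst / q))
        w = w * np.exp(-eta * grad)          # multiplicative update
        w /= w.sum()                         # project back to the simplex
        val = worst_case_kl(w, experts)
        if val < best_val:
            best_w, best_val = w.copy(), val
    return best_w, best_val


if __name__ == "__main__":
    gate, value = robust_constant_gate(EXPERTS)
    uniform = np.full(EXPERTS.shape[0], 1.0 / EXPERTS.shape[0])
    print("uniform gate worst-case KL:", worst_case_kl(uniform, EXPERTS))
    print("robust gate:", gate, "worst-case KL:", value)
```

Tracking the best iterate guarantees the returned gate is never worse than the uniform starting point on the worst-case objective, which matters because subgradient steps on a max of functions can oscillate.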


Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Machine Learning and Algorithms · Machine Learning and Data Classification