A Theoretical Framework for Modular Learning of Robust Generative Models
Corinna Cortes, Mehryar Mohri, Yutao Zhong

TL;DR
This paper introduces a theoretical framework for modular generative models that combines pre-trained experts via a robust gating mechanism, enabling scalable, robust, and potentially superior performance over monolithic models.
Contribution
It formulates a minimax game for robust gating, proves existence of such gates, and demonstrates modularity as a regularizer with generalization bounds, along with scalable algorithms.
Findings
Modular models can outperform monolithic models in certain settings.
Theoretical guarantees for the existence of robust gating functions.
Empirical results show effective mitigation of gradient conflict.
Abstract
Training large-scale generative models is resource-intensive and relies heavily on heuristic dataset weighting. We address two fundamental questions: Can we train Large Language Models (LLMs) modularly, combining small, domain-specific experts to match monolithic performance, and can we do so robustly for any data mixture, eliminating heuristic tuning? We present a theoretical framework for modular generative modeling where a set of pre-trained experts is combined via a gating mechanism. We define the space of normalized gating functions and formulate the problem as a minimax game to find a single robust gate that minimizes divergence to the worst-case data mixture. We prove the existence of such a robust gate using Kakutani's fixed-point theorem and show that modularity acts as a strong regularizer, with generalization bounds scaling with the lightweight gate's complexity.…
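To make the minimax formulation in the abstract concrete, here is a minimal toy sketch, not the paper's algorithm. It assumes discrete distributions over a small vocabulary, an input-independent gate `g` on the probability simplex (the paper's gates are functions of the input), and KL divergence as the loss. The solver is a generic exponentiated-gradient (multiplicative-weights) scheme for convex-concave games, swapped in for illustration. All names (`kl`, `worst_case`, `robust_gate`) and the example numbers are hypothetical.

```python
import numpy as np

EPS = 1e-12  # numerical guard; not part of the formulation


def kl(p, q):
    """KL(p || q) for discrete distributions on the same support."""
    mask = p > 0
    return float(np.sum(p[mask] * (np.log(p[mask]) - np.log(q[mask] + EPS))))


def worst_case(gate, domains, experts):
    """Largest divergence from any domain to the gated mixture of experts."""
    mix = gate @ experts
    return max(kl(p, mix) for p in domains)


def robust_gate(domains, experts, steps=2000, lr_gate=0.1, lr_adv=0.1):
    """Approximate min over g of max over lam of sum_k lam_k * KL(p_k || g @ experts).

    The gate player descends and the adversary (the data mixture lam) ascends,
    both with multiplicative updates that keep them on the simplex. The
    averaged gate is returned, as is standard for convex-concave games.
    """
    m, K = domains.shape[0], experts.shape[0]
    g = np.full(K, 1.0 / K)      # gate weights over experts
    lam = np.full(m, 1.0 / m)    # adversarial mixture over domains
    g_sum = np.zeros(K)
    for _ in range(steps):
        mix = g @ experts                              # gated model, shape (V,)
        losses = np.array([kl(p, mix) for p in domains])
        target = lam @ domains                         # current mixture of domains
        grad_g = -(experts @ (target / (mix + EPS)))   # gradient of the lam-weighted KL in g
        g = g * np.exp(-lr_gate * grad_g)
        g /= g.sum()
        lam = lam * np.exp(lr_adv * losses)
        lam /= lam.sum()
        g_sum += g
    return g_sum / steps


# Hypothetical example: three domains, with one expert fitted to each.
domains = np.array([[0.7, 0.2, 0.1],
                    [0.1, 0.2, 0.7],
                    [0.2, 0.6, 0.2]])
experts = domains.copy()
gate = robust_gate(domains, experts)
robust_loss = worst_case(gate, domains, experts)
single_expert_losses = [worst_case(np.eye(3)[k], domains, experts) for k in range(3)]
```

In this toy setting, `robust_loss` can be compared against `single_expert_losses` to see how a single gate hedges across all mixtures of the domains, whereas committing to any one expert leaves at least one domain poorly covered.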
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Machine Learning and Algorithms · Machine Learning and Data Classification
