Mapping Networks
Lord Sen, Shyamapada Mukherjee

TL;DR
Mapping Networks replace a deep network's high-dimensional weight space with a compact, trainable latent vector, reducing overfitting while matching or improving performance across various tasks with around 500x fewer trainable parameters.
Contribution
The paper introduces Mapping Networks, a novel method that maps a low-dimensional latent space to the weight space, supported by a theoretical Mapping Theorem, to improve efficiency and reduce overfitting.
Findings
Achieves around 500x reduction in trainable parameters.
Maintains or surpasses target network performance on vision and sequence tasks.
Effectively reduces overfitting in large-scale models.
Abstract
The escalating parameter counts in modern deep learning models pose a fundamental challenge to efficient training and to the mitigation of overfitting. We address this by introducing Mapping Networks, which replace the high-dimensional weight space with a compact, trainable latent vector, based on the hypothesis that the trained parameters of large networks reside on smooth, low-dimensional manifolds. The Mapping Theorem, enforced by a dedicated Mapping Loss, shows the existence of a mapping from this latent space to the target weight space both theoretically and in practice. Mapping Networks significantly reduce overfitting and achieve comparable or better performance than the target network across complex vision and sequence tasks, including Image Classification and Deepfake Detection, among others, with around a 500x reduction in trainable parameters.
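The core idea, training a small latent vector and mapping it into a much larger weight space, can be illustrated with a minimal sketch. This summary does not specify the form of the mapping or of the Mapping Loss, so the sketch assumes a fixed random linear map as a stand-in and a plain mean-squared-error objective. The names `make_mapping`, `target_forward`, and `train_latent` are hypothetical, and the target network is a single linear layer chosen only to keep the example short. It is not the authors' architecture.

```python
import numpy as np

def make_mapping(latent_dim, weight_dim, seed=0):
    """Fixed random linear map from latent space to flattened weights.

    Assumption: a frozen random projection stands in for the paper's mapping.
    """
    rng = np.random.default_rng(seed)
    return rng.standard_normal((weight_dim, latent_dim)) / np.sqrt(latent_dim)

def target_forward(x, flat_weights, in_dim, out_dim):
    """Hypothetical target network: one linear layer fed by mapped weights."""
    W = flat_weights[: in_dim * out_dim].reshape(in_dim, out_dim)
    b = flat_weights[in_dim * out_dim:]
    return x @ W + b

def train_latent(x, y, latent_dim=8, steps=500, lr=0.02, seed=0):
    """Gradient descent on the latent vector z only; the map P stays fixed."""
    in_dim, out_dim = x.shape[1], y.shape[1]
    weight_dim = in_dim * out_dim + out_dim
    P = make_mapping(latent_dim, weight_dim, seed)
    z = np.zeros(latent_dim)  # the only trainable parameters
    losses = []
    for _ in range(steps):
        theta = P @ z  # latent vector -> full weight vector
        err = target_forward(x, theta, in_dim, out_dim) - y
        losses.append(float(np.mean(err ** 2)))
        # Gradient of the MSE with respect to the target weights and bias.
        grad_W = 2.0 * x.T @ err / err.size
        grad_b = 2.0 * err.sum(axis=0) / err.size
        grad_theta = np.concatenate([grad_W.ravel(), grad_b])
        # Chain rule through the linear map gives the gradient for z.
        z -= lr * (P.T @ grad_theta)
    return z, P, losses

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x = rng.standard_normal((200, 20))
    y = x @ rng.standard_normal((20, 5))  # synthetic regression targets
    z, P, losses = train_latent(x, y)
    print("trainable parameters:", z.size, "of", P.shape[0], "target weights")
    print("loss: first", losses[0], "last", losses[-1])
```

In this toy setup only the 8 entries of `z` are updated while the 105 target weights are derived from them. A restricted linear subspace generally cannot match the full model's fit, which is why the learned mapping and the Mapping Loss described in the abstract matter in practice.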
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Advanced Neural Network Applications · Face recognition and analysis
