Model Parallel Training and Transfer Learning for Convolutional Neural Networks by Domain Decomposition
Axel Klawonn, Martin Lanser, Janine Weber

TL;DR
This paper studies a model parallel CNN architecture based on domain decomposition: local CNNs are trained in parallel on subimages, and their classifications are combined by a dense feedforward neural network (DNN). It compares different strategies for combining the local outputs and examines transfer learning approaches aimed at more efficient training.
Contribution
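It builds on the CNN-DNN architecture based on domain decomposition that the authors proposed in previous work, compares it with alternative ways of combining the local classifications, and evaluates transfer learning strategies for training efficiency and performance.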
Findings
The CNN-DNN architecture effectively combines local classifications into a global decision.
Transfer learning improves training efficiency and model performance.
Comparison shows trade-offs between different combination strategies.
Abstract
Deep convolutional neural networks (CNNs) have been shown to be very successful in a wide range of image processing applications. However, due to their increasing number of model parameters and an increasing availability of large amounts of training data, parallelization strategies to efficiently train complex CNNs are necessary. In previous work by the authors, a novel model parallel CNN architecture was proposed which is loosely inspired by domain decomposition. In particular, the novel network architecture is based on a decomposition of the input data into smaller subimages. For each of these subimages, local CNNs with a proportionally smaller number of parameters are trained in parallel and the resulting local classifications are then aggregated in a second step by a dense feedforward neural network (DNN). In the present work, we compare the resulting CNN-DNN architecture to less…
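The two-step structure described in the abstract (decompose the input into subimages, classify each subimage locally, then aggregate the local classifications) can be illustrated with a minimal pure-Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the 2 x 2 non-overlapping grid, the helper names `decompose` and `make_linear_classifier`, the use of a random linear layer with softmax as a stand-in for each trained local CNN and for the DNN, and the choice to feed the concatenated local class probabilities to the aggregator.

```python
import math
import random


def decompose(image, rows, cols):
    """Split a 2D image (list of equal-length lists) into rows x cols
    non-overlapping subimages, returned in row-major order."""
    height, width = len(image), len(image[0])
    if height % rows or width % cols:
        raise ValueError("image size must be divisible by the grid size")
    sub_h, sub_w = height // rows, width // cols
    subimages = []
    for r in range(rows):
        for c in range(cols):
            block = [row[c * sub_w:(c + 1) * sub_w]
                     for row in image[r * sub_h:(r + 1) * sub_h]]
            subimages.append(block)
    return subimages


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def make_linear_classifier(num_inputs, num_classes, rng):
    """Hypothetical stand-in for a trained network:
    one random linear layer followed by softmax."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(num_inputs)]
               for _ in range(num_classes)]

    def classify(features):
        scores = [sum(w * x for w, x in zip(row, features)) for row in weights]
        return softmax(scores)

    return classify


def flatten(block):
    """Flatten a 2D block into a single feature list."""
    return [pixel for row in block for pixel in row]


rng = random.Random(0)
num_classes = 3
image = [[rng.random() for _ in range(8)] for _ in range(8)]

# Step 1: decompose the 8 x 8 image into four 4 x 4 subimages.
subimages = decompose(image, 2, 2)

# Step 2: one local model per subimage. The local models do not depend on
# each other, which is what makes the scheme model parallel.
local_models = [make_linear_classifier(16, num_classes, rng) for _ in subimages]
local_probs = [model(flatten(sub)) for model, sub in zip(local_models, subimages)]

# Step 3 (assumed input format): concatenate the local class probabilities
# and pass them to the aggregating network to obtain a global decision.
dnn_input = [p for probs in local_probs for p in probs]
aggregator = make_linear_classifier(len(dnn_input), num_classes, rng)
global_probs = aggregator(dnn_input)
```

In this sketch the aggregator sees only `4 * num_classes` inputs, so it is much smaller than a network operating on the full image; in an actual implementation both the local models and the aggregator would be trained rather than randomly initialized.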
Taxonomy
Topics: Neural Networks and Applications · Machine Learning and ELM · Domain Adaptation and Few-Shot Learning
