TL;DR
This paper presents automated methods for inferring cross-modal CNN (X-CNN) topologies from a base CNN and the training data, reducing design effort. The learned architectures outperform hand-designed X-CNNs and their base variants.
Contribution
It introduces two data-driven approaches for automatically learning cross-modal CNN topologies, streamlining design and enhancing performance compared to manual methods.
Findings
Achieved up to 9% higher accuracy than hand-designed X-CNNs.
The authors conclude that the method takes less time than manual design.
Fully automated approach implemented in Xsertion library.
Abstract
This paper introduces a way to learn cross-modal convolutional neural network (X-CNN) architectures from a base convolutional network (CNN) and the training data to reduce the design cost and enable applying cross-modal networks in sparse data environments. Two approaches for building X-CNNs are presented. The base approach learns the topology in a data-driven manner, by using measurements performed on the base CNN and supplied data. The iterative approach performs further optimisation of the topology through a combined learning procedure, simultaneously learning the topology and training the network. The approaches were evaluated against examples of hand-designed X-CNNs and their base variants, showing superior performance and, in some cases, gaining an additional 9% of accuracy. From further considerations, we conclude that the presented methodology takes less time than any manual…
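To make the idea of a measurement-driven topology concrete, here is a minimal sketch. The abstract does not say which measurements are taken or how they shape the topology, so everything here is an assumption for illustration: the function `allocate_filters`, the modality names, and the score values are hypothetical, and the code is neither the paper's algorithm nor the Xsertion API. It assumes each modality gets a score (for example, the validation accuracy of the base CNN trained on that modality alone) and splits one layer's filter budget in proportion to those scores.

```python
def allocate_filters(base_filters, modality_scores, min_filters=1):
    """Split a layer's filter budget across modalities in proportion to a score.

    Hypothetical helper for illustration only; the scoring rule and the
    proportional split are assumptions, not taken from the paper.
    """
    total = sum(modality_scores.values())
    if total <= 0:
        raise ValueError("modality scores must sum to a positive value")
    return {
        name: max(min_filters, round(base_filters * score / total))
        for name, score in modality_scores.items()
    }


# Made-up per-modality scores, e.g. accuracy of the base CNN on each input alone.
scores = {"modality_a": 0.60, "modality_b": 0.20, "modality_c": 0.20}
allocation = allocate_filters(64, scores)
print(allocation)
```

Because each share is rounded independently and floored at `min_filters`, the allocated counts need not sum exactly to the original budget. A real procedure would also have to decide where cross-connections between the per-modality streams are placed, which this sketch does not attempt.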
