Dual Convexified Convolutional Neural Networks

Site Bai; Chuyang Ke; Jean Honorio

arXiv:2205.14056 · cs.LG · September 17, 2024

Open Access

TL;DR

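This paper introduces dual convexified convolutional neural networks (DCCNNs), a framework that trains convexified CNNs (CCNNs) through a dual convex program. The dual formulation avoids the cost of constructing and factorizing a large kernel matrix, and a dedicated algorithm recovers the primal weights from the dual solution by exploiting the low-rank structure of CCNNs.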

Contribution

The paper presents a novel dual convex training framework for CNNs, along with a weight recovery algorithm that enhances efficiency and reduces parameter size.

Findings

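01 Reduces the computational overhead of constructing a large kernel matrix when training convexified CNNs.

02 Eliminates the ambiguity of factorizing the kernel matrix.

03 Recovers the linear weight from the dual solution and the kernel information, despite the lack of a closed-form primal-dual mapping.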

Abstract

We propose the framework of dual convexified convolutional neural networks (DCCNNs). In this framework, we first introduce a primal learning problem motivated by convexified convolutional neural networks (CCNNs), and then construct the dual convex training program through careful analysis of the Karush-Kuhn-Tucker (KKT) conditions and Fenchel conjugates. Our approach reduces the computational overhead of constructing a large kernel matrix and, more importantly, eliminates the ambiguity of factorizing the matrix. Due to the low-rank structure in CCNNs and the related subdifferential of nuclear norms, there is no closed-form expression to recover the primal solution from the dual solution. To overcome this, we propose a highly novel weight recovery algorithm, which takes the dual solution and the kernel information as the input, and recovers the linear weight and the output of…
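
The abstract attributes the missing closed-form recovery to the subdifferential of the nuclear norm. The following is a minimal NumPy sketch of that standard fact only; it is not the paper's weight recovery algorithm. It assumes a random rank-2 matrix `A` as a stand-in for a low-rank weight; the names `nuclear_norm` and `two_subgradients` are hypothetical helpers introduced here for illustration. The sketch builds two different subgradients of the nuclear norm at `A` and checks that both satisfy the subgradient inequality, so the subdifferential at a low-rank point is a set rather than a single matrix.

```python
import numpy as np

def nuclear_norm(A):
    """Nuclear norm: the sum of the singular values of A."""
    return np.linalg.svd(A, compute_uv=False).sum()

def two_subgradients(A, rel_tol=1e-10):
    """Return two distinct elements of the subdifferential of ||.||_* at A.

    For A = U_r diag(s_r) V_r^T with rank r, every matrix of the form
    U_r V_r^T + W with U_r^T W = 0, W V_r = 0 and spectral norm ||W||_2 <= 1
    is a subgradient. We return the choice W = 0 and one nonzero choice of W.
    """
    U, s, Vt = np.linalg.svd(A, full_matrices=True)
    r = int((s > rel_tol * s[0]).sum())       # numerical rank
    G = U[:, :r] @ Vt[:r, :]                  # W = 0
    # W lives on the orthogonal complements of the column and row spaces.
    W = 0.5 * np.outer(U[:, r], Vt[r, :])     # requires r < min(A.shape)
    return G, G + W

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 2)) @ rng.standard_normal((2, 5))  # rank 2 (illustrative)
G1, G2 = two_subgradients(A)

results = []
for G in (G1, G2):
    spectral = np.linalg.norm(G, 2)           # largest singular value of G
    inner = np.sum(G * A)                     # trace inner product <G, A>
    # Subgradient inequality: ||B||_* >= ||A||_* + <G, B - A> for all B.
    min_gap = min(
        nuclear_norm(B) - nuclear_norm(A) - np.sum(G * (B - A))
        for B in rng.standard_normal((200,) + A.shape)
    )
    results.append((spectral, inner, min_gap))

distance = np.linalg.norm(G1 - G2)            # Frobenius distance between the two
```

Both matrices have unit spectral norm and attain `<G, A>` equal to the nuclear norm of `A`, which is the dual-norm characterization of a subgradient. Because they differ, optimality conditions that involve this subdifferential do not pin down a unique matrix at a low-rank point, which is consistent with the abstract's remark that no closed-form primal recovery is available.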

Taxonomy

Topics: Sparse and Compressive Sensing Techniques · Machine Learning and ELM · Stochastic Gradient Optimization Techniques