TL;DR
Deep-ICE is an algorithm that finds globally optimal solutions to the problem of minimizing misclassifications in two-layer ReLU and maxout networks, with a coreset extension that makes it applicable to large datasets.
Contribution
The paper introduces the first globally optimal algorithm for empirical risk minimization of two-layer maxout and ReLU networks, together with a coreset selection method for large-scale data.
Findings
Provides provably exact solutions for small-scale datasets
Reduces large datasets to a manageable size with a coreset selection method
Achieves 20-30% fewer misclassifications compared to state-of-the-art methods
Abstract
This paper introduces the first globally optimal algorithm for the empirical risk minimization problem of two-layer maxout and ReLU networks, i.e., minimizing the number of misclassifications. The algorithm's worst-case time complexity is determined by the number of hidden neurons and the number of features. It can be generalized to accommodate arbitrary computable loss functions without affecting its computational complexity. Our experiments demonstrate that the proposed algorithm provides provably exact solutions for small-scale datasets. To handle larger datasets, we introduce a novel coreset selection method that reduces the data size to a manageable scale, making it feasible for our algorithm. This extension enables efficient processing of large-scale datasets and achieves significantly improved performance, with a 20-30%…
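To make the objective concrete, the following is a minimal sketch of the quantity being minimized: the number of misclassifications (0-1 loss) of a two-layer ReLU network on a labeled dataset. It is not the Deep-ICE algorithm itself. The parameterization (hidden weights `W`, biases `b`, output weights `alpha`, labels in {-1, +1}, ties at zero predicted as +1) and all function names are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def relu_net_predict(X, W, b, alpha):
    """Predict labels in {-1, +1} with a two-layer ReLU network.

    X: (N, D) data, W: (K, D) hidden weights, b: (K,) hidden biases,
    alpha: (K,) output weights. This layout is an illustrative assumption.
    """
    hidden = np.maximum(X @ W.T + b, 0.0)  # ReLU activations, shape (N, K)
    score = hidden @ alpha                 # linear output layer, shape (N,)
    return np.where(score >= 0, 1, -1)     # ties at zero are assigned +1

def zero_one_loss(X, y, W, b, alpha):
    """Count misclassified points: the empirical risk under 0-1 loss."""
    return int(np.sum(relu_net_predict(X, W, b, alpha) != y))

# Toy data: four points in 2D, separable by the sign of x1 + x2.
X_demo = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])
y_demo = np.array([1, 1, -1, -1])

# Two hidden neurons whose difference reproduces x1 + x2.
W_demo = np.array([[1.0, 1.0], [-1.0, -1.0]])
b_demo = np.zeros(2)

loss_good = zero_one_loss(X_demo, y_demo, W_demo, b_demo, np.array([1.0, -1.0]))
loss_bad = zero_one_loss(X_demo, y_demo, W_demo, b_demo, np.array([-1.0, 1.0]))
print(loss_good, loss_bad)
```

The 0-1 loss is piecewise constant in the parameters, so gradient-based training cannot minimize it directly. A globally optimal method has to search over the distinct labelings the network can realize rather than follow gradients.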
