On the Furthest Hyperplane Problem and Maximal Margin Clustering
Zohar Karnin, Edo Liberty, Shachar Lovett, Roy Schwartz, Omri Weinstein

TL;DR
This paper introduces the Furthest Hyperplane Problem (FHP), an unsupervised variant of SVMs focused on maximizing the margin, and provides the first provable bounds, algorithms, and complexity results for this NP-hard problem.
Contribution
It presents the first theoretical analysis of FHP, including bounds, approximation algorithms, and complexity results, establishing its computational difficulty.
Findings
A randomized algorithm with n^{O(1/θ^2)} time complexity.
An approximation algorithm achieving near-optimal margin for most points.
FHP does not admit a PTAS, proven via a gap-preserving reduction.
Abstract
This paper introduces the Furthest Hyperplane Problem (FHP), which is an unsupervised counterpart of Support Vector Machines. Given a set of n points in R^d, the objective is to produce the hyperplane (passing through the origin) which maximizes the separation margin, that is, the minimal distance between the hyperplane and any input point. To the best of our knowledge, this is the first paper achieving provable results regarding FHP. We provide both lower and upper bounds to this NP-hard problem. First, we give a simple randomized algorithm whose running time is n^{O(1/θ^2)}, where θ is the optimal separation margin. We show that its exponential dependency on 1/θ^2 is tight, up to sub-polynomial factors, assuming SAT cannot be solved in sub-exponential time. Next, we give an efficient approximation algorithm. For any α ∈ [0, 1], the algorithm produces a…
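The objective described in the abstract can be made concrete with a small sketch. The code below is only an illustration of the problem statement, not the paper's algorithm: `margin` evaluates the separation margin of a candidate hyperplane through the origin (given by its normal vector `w`), and `random_search_fhp` is a naive baseline that samples random unit normals and keeps the best one. Both function names, the `trials` and `seed` parameters, and the use of NumPy are assumptions made for this example.

```python
import numpy as np

def margin(w, points):
    """Separation margin of the hyperplane {x : <w, x> = 0}.

    This is the smallest distance from any input point to the hyperplane,
    i.e. min_i |<w, x_i>| once w is scaled to unit length.
    """
    w = np.asarray(w, dtype=float)
    points = np.asarray(points, dtype=float)
    w = w / np.linalg.norm(w)  # distance formula requires a unit normal
    return float(np.min(np.abs(points @ w)))

def random_search_fhp(points, trials=1000, seed=0):
    """Naive baseline (an assumption for illustration, not the paper's method):
    sample random unit normals and keep the one with the largest margin."""
    points = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    d = points.shape[1]
    best_w, best_margin = None, -1.0
    for _ in range(trials):
        w = rng.standard_normal(d)
        w /= np.linalg.norm(w)  # uniform random direction on the unit sphere
        m = margin(w, points)
        if m > best_margin:
            best_w, best_margin = w, m
    return best_w, best_margin

if __name__ == "__main__":
    rng = np.random.default_rng(42)
    pts = rng.standard_normal((50, 3))
    w, m = random_search_fhp(pts, trials=2000)
    print("best normal found:", w)
    print("its margin:", m)
```

Because the hyperplane must pass through the origin, the margin of any candidate can never exceed the norm of the shortest input point, which gives a quick sanity check on any returned value.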
Taxonomy
Topics: Machine Learning and Algorithms · Optimization and Search Problems · Imbalanced Data Classification Techniques
