Fast semi-supervised discriminant analysis for binary classification of large data-sets

Joris Tavernier; Jaak Simm; Karl Meerbergen; Joerg Kurt Wegner; Hugo Ceulemans; Yves Moreau

arXiv:1709.04794·cs.AI·February 21, 2019

TL;DR

This paper introduces three scalable Krylov subspace-based algorithms for semi-supervised discriminant analysis, significantly reducing computation time while maintaining good predictive performance on large, high-dimensional datasets.

Contribution

The paper presents novel scalable algorithms for semi-supervised discriminant analysis that leverage Krylov subspace methods and data centralization, improving efficiency for large datasets.

Findings

01 Achieves good predictive performance on industry-scale pharmaceutical data

02 Methods require only a few seconds to compute, significantly faster than the previous state of the art

03 Effectively exploits data sparsity and the shift-invariance of Krylov subspaces
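
The shift-invariance mentioned in the findings is the standard fact that the Krylov subspace span{b, Ab, ..., A^(k-1)b} is unchanged when A is replaced by A + sigma*I for any scalar sigma. The sketch below checks this numerically on a small random symmetric matrix. It is an illustration only, not the authors' algorithm: the function name `krylov_basis`, the matrix size, and the shift value are all assumptions chosen for the example.

```python
import numpy as np

def krylov_basis(A, b, k):
    """Orthonormal basis of span{b, Ab, ..., A^(k-1) b} (Arnoldi-style Gram-Schmidt)."""
    n = b.shape[0]
    Q = np.zeros((n, k))
    Q[:, 0] = b / np.linalg.norm(b)
    for j in range(1, k):
        w = A @ Q[:, j - 1]
        # Orthogonalize against the previous vectors; two passes for numerical stability.
        for _ in range(2):
            w = w - Q[:, :j] @ (Q[:, :j].T @ w)
        Q[:, j] = w / np.linalg.norm(w)
    return Q

rng = np.random.default_rng(0)
n, k, sigma = 50, 5, 3.0           # hypothetical sizes and shift
M = rng.standard_normal((n, n))
A = M + M.T                        # symmetric test matrix
b = rng.standard_normal(n)

Q_plain = krylov_basis(A, b, k)
Q_shift = krylov_basis(A + sigma * np.eye(n), b, k)

# If both bases span the same subspace, projecting Q_shift onto
# span(Q_plain) leaves (numerically) nothing behind.
residual = np.linalg.norm(Q_shift - Q_plain @ (Q_plain.T @ Q_shift))
```

Because the subspace does not depend on the shift, a basis built once can in principle be reused for several shifted systems, which is one reason shifted problems are cheap for Krylov methods.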

Abstract

High-dimensional data requires scalable algorithms. We propose and analyze three scalable and related algorithms for semi-supervised discriminant analysis (SDA). These methods are based on Krylov subspace methods which exploit the data sparsity and the shift-invariance of Krylov subspaces. In addition, the problem definition was improved by adding centralization to the semi-supervised setting. The proposed methods are evaluated on an industry-scale data set from a pharmaceutical company to predict compound activity on target proteins. The results show that SDA achieves good predictive performance and that our methods require only a few seconds, significantly improving computation time over the previous state of the art.
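
Centering a sparse data matrix explicitly would make it dense, so centralization and sparsity can only coexist if the centered matrix is never formed. A minimal sketch of one common way to do this follows, assuming column-mean centering: the products with the centered matrix are computed from the original matrix plus a rank-one correction. The helper names `centered_matvec` and `centered_rmatvec` are hypothetical and this is not taken from the paper. A mostly-zero NumPy array stands in for a sparse matrix so that the example needs only NumPy.

```python
import numpy as np

def centered_matvec(X, mu, v):
    """(X - 1 mu^T) v computed as X v - (mu . v), without forming the centered matrix."""
    return X @ v - np.dot(mu, v)

def centered_rmatvec(X, mu, u):
    """(X - 1 mu^T)^T u computed as X^T u - mu * sum(u)."""
    return X.T @ u - mu * np.sum(u)

rng = np.random.default_rng(0)
X = rng.random((200, 30))
X[X < 0.9] = 0.0                   # keep roughly 10% nonzeros (stand-in for sparse data)
mu = X.mean(axis=0)                # column means

v = rng.standard_normal(30)
u = rng.standard_normal(200)

# Reference: explicit (dense) centering, only affordable on small examples.
Xc = X - mu
err_fwd = np.linalg.norm(centered_matvec(X, mu, v) - Xc @ v)
err_adj = np.linalg.norm(centered_rmatvec(X, mu, u) - Xc.T @ u)
```

Krylov methods only touch the matrix through products like these, so the cost per iteration stays proportional to the number of nonzeros in the original data.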
