# Large-Margin Multiple Kernel Learning for Discriminative Features Selection and Representation Learning

**Authors:** Babak Hosseini, Barbara Hammer

arXiv: 1903.03364 · 2019-03-14

## TL;DR

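This paper introduces a multi-class large-margin MKL framework (LMMK) that enhances the local separation of classes in the feature space and uses a sparsity term to select a small subset of base kernels. On real-world datasets it reaches classification accuracy competitive with state-of-the-art MKL algorithms while yielding a more interpretable feature selection.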

## Contribution

The paper proposes a novel multi-class MKL method that combines large-margin optimization with a sparsity term to improve local class separation and feature selection. This moves beyond the binary-classification and linear-separation assumptions that most prior MKL work relies on.

## Key findings

- Achieves classification accuracy competitive with state-of-the-art MKL algorithms on real-world datasets.
- Learns sparse kernel weights for interpretable feature selection.
- Enhances local class separation in the feature space.

## Abstract

Multiple kernel learning (MKL) algorithms combine different base kernels to obtain a more efficient representation in the feature space. Focusing on discriminative tasks, MKL has been used successfully for feature selection and finding the significant modalities of the data. In such applications, each base kernel represents one dimension of the data or is derived from one specific descriptor. Therefore, MKL finds an optimal weighting scheme for the given kernels to increase the classification accuracy. Nevertheless, the majority of the works in this area focus on only binary classification problems or aim for linear separation of the classes in the kernel space, which are not realistic assumptions for many real-world problems. In this paper, we propose a novel multi-class MKL framework which improves the state-of-the-art by enhancing the local separation of the classes in the feature space. Besides, by using a sparsity term, our large-margin multiple kernel algorithm (LMMK) performs discriminative feature selection by aiming to employ a small subset of the base kernels. Based on our empirical evaluations on different real-world datasets, LMMK provides a competitive classification accuracy compared with the state-of-the-art algorithms in MKL. Additionally, it learns a sparse set of non-zero kernel weights which leads to a more interpretable feature selection and representation learning.
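
To make the weighting scheme in the abstract concrete, the following is a minimal sketch of how a combined kernel can be formed as a non-negative weighted sum of base kernels, with one base kernel per data dimension. Everything specific here is an assumption for illustration and does not come from the paper: the function names `rbf_kernel_1d` and `combine_kernels`, the choice of RBF base kernels, the `gamma` value, and the hand-picked weight vector `beta`. In particular, LMMK learns its weights through a large-margin objective with a sparsity term, which this sketch does not reproduce.

```python
import numpy as np

def rbf_kernel_1d(x, gamma=1.0):
    """RBF kernel matrix computed from a single feature column x of shape (n,)."""
    diff = x[:, None] - x[None, :]
    return np.exp(-gamma * diff ** 2)

def combine_kernels(kernels, beta):
    """Weighted sum of base kernel matrices with non-negative weights beta."""
    beta = np.asarray(beta, dtype=float)
    if np.any(beta < 0):
        raise ValueError("kernel weights must be non-negative")
    # Contract the weight vector against the stacked (m, n, n) kernel tensor.
    return np.tensordot(beta, np.stack(kernels), axes=1)

# Toy data: 6 samples, 4 feature dimensions (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 4))

# One base kernel per feature dimension.
base_kernels = [rbf_kernel_1d(X[:, j]) for j in range(X.shape[1])]

# Hand-picked sparse weights (an assumption; a real MKL method would learn these).
beta = np.array([0.7, 0.0, 0.3, 0.0])
K = combine_kernels(base_kernels, beta)

# Dimensions with non-zero weight are the "selected" features.
selected = np.flatnonzero(beta)
```

A zero weight removes the corresponding base kernel from `K` entirely, which is why a sparse weight vector can be read as a feature selection. A matrix such as `K` could then be passed to any classifier that accepts precomputed kernels, for example scikit-learn's `SVC(kernel="precomputed")`.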

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1903.03364/full.md

## Figures

12 figures with captions in the complete paper: https://tomesphere.com/paper/1903.03364/full.md

## References

55 references — full list in the complete paper: https://tomesphere.com/paper/1903.03364/full.md

---
Source: https://tomesphere.com/paper/1903.03364