# Large Margin Multi-modal Multi-task Feature Extraction for Image Classification

**Authors:** Yong Luo, Yonggang Wen, Dacheng Tao, Jie Gui, Chao Xu

arXiv: 1904.04088 · 2019-04-09

## TL;DR

This paper introduces LM3FE, a large margin multi-modal multi-task feature extraction framework for image classification. It jointly learns a feature extraction matrix for each modality and the coefficients that combine the modalities, and it relies on the large margin principle to extract strongly predictive features.

## Contribution

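The paper proposes a large margin multi-modal multi-task feature extraction method that simultaneously learns a feature extraction matrix for each modality and the modality combination coefficients. Most earlier multi-task feature extraction methods consider only a single type of feature; here the complementarity of different modalities is used to help reduce feature redundancy within each modality.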

## Key findings

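- Experiments on two challenging real-world image datasets demonstrate the effectiveness and superiority of the proposed method
- Handles correlated and noisy features within each modality
- Uses the complementarity of different modalities to further reduce feature redundancy in each modality
- The alternating optimization algorithm splits the problem into sub-problems that can each be solved efficiently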

## Abstract

The features used in many image analysis-based applications are frequently of very high dimension. Feature extraction offers several advantages in high-dimensional cases, and many recent studies have used multi-task feature extraction approaches, which often outperform single-task feature extraction approaches. However, most of these methods are limited in that they only consider data represented by a single type of feature, even though features usually represent images from multiple modalities. We therefore propose a novel large margin multi-modal multi-task feature extraction (LM3FE) framework for handling multi-modal features for image classification. In particular, LM3FE simultaneously learns the feature extraction matrix for each modality and the modality combination coefficients. In this way, LM3FE not only handles correlated and noisy features, but also utilizes the complementarity of different modalities to further help reduce feature redundancy in each modality. The large margin principle employed also helps to extract strongly predictive features so that they are more suitable for prediction (e.g., classification). An alternating algorithm is developed for problem optimization and each sub-problem can be efficiently solved. Experiments on two challenging real-world image datasets demonstrate the effectiveness and superiority of the proposed method.
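
The alternating scheme described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the squared hinge loss, the Frobenius-norm regularizer, the simplex constraint on the combination coefficients, the plain gradient-step updates, and every name used (`fit_alternating`, `project_to_simplex`, `Xs`, `theta`, and so on) are assumptions chosen purely for illustration. It only shows the general pattern of alternating between per-modality matrices and modality combination coefficients under a margin-based loss.

```python
import numpy as np


def project_to_simplex(v):
    """Euclidean projection of v onto {x : x >= 0, sum(x) = 1}."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    k = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - (css - 1.0) / k > 0)[0][-1]
    tau = (css[rho] - 1.0) / (rho + 1)
    return np.maximum(v - tau, 0.0)


def combined_scores(Xs, Ws, theta):
    """Weighted sum of the per-modality scores X_m @ W_m."""
    return sum(t * (X @ W) for t, X, W in zip(theta, Xs, Ws))


def squared_hinge(scores, Y):
    """Mean squared hinge loss for labels Y in {-1, +1} (assumed loss)."""
    return np.mean(np.maximum(0.0, 1.0 - Y * scores) ** 2)


def loss_grad_wrt_scores(scores, Y):
    """Gradient of the mean squared hinge loss with respect to the scores."""
    return -2.0 * Y * np.maximum(0.0, 1.0 - Y * scores) / Y.size


def fit_alternating(Xs, Y, n_iters=200, lr=0.1, reg=1e-2, seed=0):
    """Alternate gradient steps on the matrices W_m and the coefficients theta."""
    rng = np.random.default_rng(seed)
    n_tasks = Y.shape[1]
    Ws = [0.01 * rng.standard_normal((X.shape[1], n_tasks)) for X in Xs]
    theta = np.full(len(Xs), 1.0 / len(Xs))
    history = [squared_hinge(combined_scores(Xs, Ws, theta), Y)]
    for _ in range(n_iters):
        # Step 1: hold theta fixed, take one gradient step on every W_m.
        G = loss_grad_wrt_scores(combined_scores(Xs, Ws, theta), Y)
        Ws = [W - lr * (t * (X.T @ G) + reg * W)
              for t, X, W in zip(theta, Xs, Ws)]
        # Step 2: hold every W_m fixed, take a projected gradient step on theta.
        per_modality = [X @ W for X, W in zip(Xs, Ws)]
        G = loss_grad_wrt_scores(combined_scores(Xs, Ws, theta), Y)
        grad_theta = np.array([np.sum(G * S) for S in per_modality])
        theta = project_to_simplex(theta - lr * grad_theta)
        history.append(squared_hinge(combined_scores(Xs, Ws, theta), Y))
    return Ws, theta, history


# Synthetic demo (hypothetical data): two modalities, three binary tasks,
# and only modality 0 carries label information.
rng = np.random.default_rng(42)
X0 = rng.standard_normal((200, 5))
X1 = rng.standard_normal((200, 8))
Y = np.sign(X0 @ rng.standard_normal((5, 3)))
Ws, theta, history = fit_alternating([X0, X1], Y)
print("combination coefficients:", theta)
print("loss before / after:", history[0], history[-1])
```

In this toy run only the first modality is informative, so one would expect its coefficient to grow relative to the second; the exact values depend on the step size and iteration count. A faithful implementation would need the objective and update rules from the complete paper.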

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1904.04088/full.md

## Figures

16 figures with captions in the complete paper: https://tomesphere.com/paper/1904.04088/full.md

## References

39 references — full list in the complete paper: https://tomesphere.com/paper/1904.04088/full.md

---
Source: https://tomesphere.com/paper/1904.04088