# Cross-modal Subspace Learning via Kernel Correlation Maximization and Discriminative Structure Preserving

**Authors:** Jun Yu, Xiao-Jun Wu

arXiv: 1904.00776 · 2020-01-08

## TL;DR

This paper introduces a novel cross-modal subspace learning framework that maximizes kernel correlation and preserves semantic structure, improving the alignment of heterogeneous data modalities.

## Contribution

The paper proposes CKD, a framework that combines kernel correlation maximization with semantic structure preservation, using a shared semantic graph and the Hilbert-Schmidt Independence Criterion (HSIC).

## Key findings

- Reported as competitive with classic subspace learning methods on three public datasets
- Effectively preserves semantic neighbor relationships within modalities
- Ensures inter-modality correlation and intra-modality structure are maintained

## Abstract

Measuring similarity between heterogeneous data is still an open problem. Many methods have been developed to learn a common subspace in which the similarity between different modalities can be calculated directly. However, most existing works focus on learning a latent subspace without preserving semantic structural information well, and therefore cannot achieve the desired results. In this paper, we propose a novel framework, termed Cross-modal subspace learning via Kernel correlation maximization and Discriminative structure-preserving (CKD), to address this problem in two ways. First, we construct a shared semantic graph so that the data of each modality preserve semantic neighbor relationships. Second, we introduce the Hilbert-Schmidt Independence Criterion (HSIC) to ensure consistency between the feature similarity and the semantic similarity of samples. Our model not only captures inter-modality correlation by maximizing kernel correlation but also preserves the semantic structural information within each modality. Extensive experiments are performed to evaluate the proposed framework on three public datasets. The experimental results demonstrate that the proposed CKD is competitive with classic subspace learning methods.
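
The HSIC term mentioned in the abstract can be illustrated with the standard biased empirical estimator, `tr(K H L H) / (n - 1)^2`, where `H` is the centering matrix. The sketch below is not the paper's implementation: the linear feature kernel, the same-label similarity matrix, the toy data, and the names `empirical_hsic` and `label_similarity` are all assumptions chosen for illustration. It only shows that features whose similarity agrees with the label similarity score a higher HSIC than the same features with their rows shuffled.

```python
import numpy as np


def centering_matrix(n):
    # H = I - (1/n) * 1 1^T removes the mean from kernel rows and columns.
    return np.eye(n) - np.ones((n, n)) / n


def empirical_hsic(K, L):
    # Standard biased empirical HSIC estimator: tr(K H L H) / (n - 1)^2.
    n = K.shape[0]
    H = centering_matrix(n)
    return float(np.trace(K @ H @ L @ H)) / (n - 1) ** 2


def label_similarity(labels):
    # Hypothetical semantic similarity: 1 if two samples share a label, else 0.
    # The paper's actual graph construction may differ.
    labels = np.asarray(labels)
    return (labels[:, None] == labels[None, :]).astype(float)


rng = np.random.default_rng(0)
labels = np.repeat([0, 1, 2], 20)            # 60 toy samples, 3 classes
S = label_similarity(labels)                 # semantic-similarity kernel

# Toy features clustered around one center per class (assumed data).
centers = rng.normal(size=(3, 5)) * 3.0
X_aligned = centers[labels] + 0.1 * rng.normal(size=(60, 5))

# Same features with rows shuffled, so they no longer match the labels.
X_shuffled = X_aligned[rng.permutation(60)]

# Linear kernels on the features (an assumption; any valid kernel works).
hsic_aligned = empirical_hsic(X_aligned @ X_aligned.T, S)
hsic_shuffled = empirical_hsic(X_shuffled @ X_shuffled.T, S)

print("HSIC, label-consistent features:", hsic_aligned)
print("HSIC, shuffled features:", hsic_shuffled)
```

Because this estimator is not normalized, its value scales with the magnitude of the kernels; comparing a feature set against a shuffled copy of itself keeps the scale fixed so that only the agreement with the labels changes.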

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1904.00776/full.md

## Figures

24 figures with captions in the complete paper: https://tomesphere.com/paper/1904.00776/full.md

## References

49 references — full list in the complete paper: https://tomesphere.com/paper/1904.00776/full.md

---
Source: https://tomesphere.com/paper/1904.00776