# Integrating Tensor Similarity to Enhance Clustering Performance

**Authors:** Hong Peng, Yu Hu, Jiazhou Chen, Haiyan Wang, Yang Li, and Hongmin Cai

arXiv: 1905.03920 · 2020-06-29

## TL;DR

This paper introduces a tensor-based high-order similarity measure to improve clustering accuracy, especially in noisy or imbalanced datasets, by complementing traditional pairwise similarities.

## Contribution

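The paper proposes a fourth-order tensor similarity defined between pairs of samples, derives a high-order similarity matrix from it by eigenvalue decomposition, and integrates that matrix with the existing pairwise similarity. The integrated measure, named IPS2, is intended to make clustering more robust by capturing spatial information that pairwise similarity alone misses.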

## Key findings

- IPS2, the integrated similarity, significantly outperforms previous similarity-based methods on real-world datasets.
- The high-order similarity improves robustness to noise and sampling imbalance.
- The method effectively handles under-sampled and noisy data.

## Abstract

The performance of most clustering methods hinges on the pairwise affinity they use, which is usually represented by a similarity matrix. However, pairwise similarity is notoriously vulnerable to noise contamination and to imbalance in samples or features, which hinders accurate clustering. To tackle this issue, we propose to use information among samples to boost clustering performance. We prove that a simplified similarity between pairs, represented by a fourth-order tensor, equals the Kronecker product of pairwise similarity matrices under the decomposable assumption, and provides complementary information that the pairwise similarity misses under the indecomposable assumption. A high-order similarity matrix is then obtained from the tensor similarity via eigenvalue decomposition. The high-order similarity, which captures spatial information, serves as a robust complement to the pairwise similarity. It is further integrated with the popular pairwise similarity, and the integrated measure is named IPS2, to boost clustering performance. Extensive experiments demonstrate that the proposed IPS2 significantly outperforms previous similarity-based methods on real-world datasets and is capable of handling clustering tasks on under-sampled and noisy datasets.
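The decomposable case described in the abstract can be illustrated numerically. The sketch below is not the authors' implementation: the Gaussian kernel, the helper names `gaussian_similarity` and `leading_eigvec_as_matrix`, the toy data, and the index convention used to unfold the fourth-order tensor into an n² × n² matrix are all assumptions made for illustration. It builds the Kronecker product of a pairwise similarity matrix with itself, takes the leading eigenvector, and reshapes it into an n × n matrix. In this decomposable case the result is just the outer product of the leading eigenvector of the pairwise matrix with itself, which is consistent with the abstract's point that complementary information arises only in the indecomposable case.

```python
import numpy as np

def gaussian_similarity(X, sigma=1.0):
    """Pairwise Gaussian-kernel similarity (an illustrative choice, not from the paper)."""
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dists / (2.0 * sigma ** 2))

def leading_eigvec_as_matrix(T_unfolded, n):
    """Reshape the leading eigenvector of a symmetric (n*n, n*n) matrix to (n, n)."""
    # eigh returns eigenvalues in ascending order, so the last column is the leading one.
    _, eigvecs = np.linalg.eigh(T_unfolded)
    return eigvecs[:, -1].reshape(n, n)

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 2))          # hypothetical toy data: 6 samples in 2-D
S = gaussian_similarity(X)           # pairwise similarity matrix, shape (n, n)
n = S.shape[0]

# Decomposable case: the unfolded tensor is the Kronecker product S (x) S.
# With NumPy's convention, T[i*n + k, j*n + l] == S[i, j] * S[k, l].
T = np.kron(S, S)

# Matrix obtained from the tensor via eigenvalue decomposition (assumed reading).
H = leading_eigvec_as_matrix(T, n)

# Reference: outer product of the leading eigenvector of S with itself.
_, V = np.linalg.eigh(S)
v = V[:, -1]
outer = np.outer(v, v)

# Eigenvectors are defined only up to sign, so compare magnitudes.
matches = np.allclose(np.abs(H), np.abs(outer), atol=1e-8)
```

Here `matches` records whether the reshaped leading eigenvector agrees with the outer product up to sign. For a tensor that is not a Kronecker product of pairwise matrices, no such identity is expected, and that is the situation in which the abstract says the high-order similarity adds information.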

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1905.03920/full.md

## Figures

27 figures with captions in the complete paper: https://tomesphere.com/paper/1905.03920/full.md

## References

29 references — full list in the complete paper: https://tomesphere.com/paper/1905.03920/full.md

---
Source: https://tomesphere.com/paper/1905.03920