# Generalized Dirichlet-process-means for $f$-separable distortion measures

**Authors:** Masahiro Kobayashi, Kazuho Watanabe

arXiv: 1901.11331 · 2021-08-26

## TL;DR

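This paper extends DP-means clustering from the average distortion measure to $f$-separable distortion measures. A single unified algorithm covers the whole family, and the choice of $f$ addresses the method's vulnerability to outliers and large maximum distortion in clusters. The approach is evaluated in numerical experiments on real datasets.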

## Contribution

The paper generalizes the DP-means objective function to $f$-separable distortion measures, proposes a unified learning algorithm for it, and analyzes the influence function of the estimated cluster center to evaluate robustness against outliers.

## Key findings

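- Selecting the function $f$ addresses the vulnerability of standard DP-means to outliers and to large maximum distortion in clusters.
- One learning algorithm applies across the family of $f$-separable distortion measures.
- Numerical experiments on real datasets demonstrate the performance of the generalized method.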

## Abstract

DP-means clustering was obtained as an extension of $K$-means clustering. While it is implemented with a simple and efficient algorithm, it can estimate the number of clusters simultaneously. However, DP-means is specifically designed for the average distortion measure. Therefore, it is vulnerable to outliers in data, and can cause large maximum distortion in clusters. In this work, we extend the objective function of the DP-means to $f$-separable distortion measures and propose a unified learning algorithm to overcome the above problems by selecting the function $f$. Further, the influence function of the estimated cluster center is analyzed to evaluate the robustness against outliers. We demonstrate the performance of the generalized method by numerical experiments using real datasets.
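To make the idea in the abstract more concrete, the following is a minimal sketch, not the authors' algorithm. It implements a standard DP-means-style loop with squared Euclidean distance (a point opens a new cluster when its distortion to every center exceeds a penalty `lam`) and adds one assumed generalization: each center is updated as a mean weighted by $f'(d)$, which is one plausible way to minimize a sum of $f(d)$ terms. The names `dp_means_sketch`, `sq_dist`, `lam`, and `f_prime`, and the example choice $f(d) = \log(1 + d)$, are hypothetical and do not come from the paper.

```python
def sq_dist(x, c):
    """Squared Euclidean distance, used here as the base distortion d(x, c)."""
    return sum((a - b) ** 2 for a, b in zip(x, c))


def dp_means_sketch(points, lam, f_prime=lambda d: 1.0, n_iter=20):
    """DP-means-style clustering with an assumed f'-weighted center update.

    With the default f_prime (constant 1) this reduces to plain DP-means.
    f_prime is assumed to be strictly positive.
    """
    dim = len(points[0])
    n = len(points)
    # Start from a single center at the global mean.
    centers = [[sum(p[j] for p in points) / n for j in range(dim)]]
    labels = [0] * n
    for _ in range(n_iter):
        # Assignment step: nearest center, or a new cluster if too far.
        for i, x in enumerate(points):
            dists = [sq_dist(x, c) for c in centers]
            k = min(range(len(centers)), key=dists.__getitem__)
            if dists[k] > lam:
                centers.append(list(x))
                k = len(centers) - 1
            labels[i] = k
        # Update step: weighted mean per cluster; empty clusters are dropped.
        new_centers, remap = [], {}
        for k, c in enumerate(centers):
            members = [x for x, l in zip(points, labels) if l == k]
            if not members:
                continue
            w = [f_prime(sq_dist(x, c)) for x in members]
            total = sum(w)
            remap[k] = len(new_centers)
            new_centers.append(
                [sum(wi * x[j] for wi, x in zip(w, members)) / total
                 for j in range(dim)]
            )
        labels = [remap[l] for l in labels]
        centers = new_centers
    return centers, labels


# One tight group near 0 plus a single distant point acting as an outlier.
data = [(0.0,), (0.2,), (-0.2,), (3.0,)]
plain, _ = dp_means_sketch(data, lam=100.0)
# f(d) = log(1 + d) gives f'(d) = 1 / (1 + d), which down-weights far points.
robust, _ = dp_means_sketch(data, lam=100.0, f_prime=lambda d: 1.0 / (1.0 + d))
print(plain[0], robust[0])
```

In this toy run both calls keep a single cluster because `lam` is large. The plain update returns the arithmetic mean, which the outlier pulls upward, while the weighted update stays closer to the tight group. This is only meant to illustrate how the choice of $f$ can affect sensitivity to outliers.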

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1901.11331/full.md

## Figures

114 figures with captions in the complete paper: https://tomesphere.com/paper/1901.11331/full.md

## References

34 references — full list in the complete paper: https://tomesphere.com/paper/1901.11331/full.md

---
Source: https://tomesphere.com/paper/1901.11331