Schema matching using Gaussian mixture models with Wasserstein distance

Mateusz Przyborowski; Mateusz Pabiś; Andrzej Janusz; Dominik Ślęzak

arXiv:2111.14244·cs.LG·April 1, 2022

Open Access

TL;DR

This paper proposes an approximation method for computing the Wasserstein distance between Gaussian mixture models to improve schema matching, demonstrated with real-world data applications.

Contribution

It introduces a novel approximation of Wasserstein distance for Gaussian mixture models, simplifying calculations for schema matching tasks.

Findings

01

The proposed approximation reduces the Wasserstein distance between Gaussian mixture models to a linear problem

02

Application examples demonstrate practical utility on real-world data

03

Improved schema matching accuracy using the proposed method

Abstract

Gaussian mixture models are a powerful tool, used mostly for clustering but, with proper preparation, also for feature extraction, pattern recognition, image segmentation, and machine learning in general. In schema matching, mixture models computed on different pieces of data can retain crucial information about the structure of the dataset. The Wasserstein distance is a useful way to compare the resulting mixture models; however, it is not easy to calculate for mixture distributions. In this paper we derive one possible approximation of the Wasserstein distance between Gaussian mixture models and reduce it to a linear problem. Furthermore, we show application examples on real-world data.
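
The abstract does not spell out the construction, so the following is only a rough sketch of one common way such a reduction can look, not the authors' derivation. It assumes each mixture is treated as a discrete distribution over its components, with the cost of moving mass between two components given by the closed-form squared 2-Wasserstein distance between Gaussians. The resulting transport problem is a linear program. The function names (`psd_sqrt`, `gaussian_w2_squared`, `gmm_distance`) are hypothetical and chosen for illustration.

```python
import numpy as np
from scipy.optimize import linprog


def psd_sqrt(S):
    """Square root of a symmetric positive semidefinite matrix via eigendecomposition."""
    vals, vecs = np.linalg.eigh(S)
    return (vecs * np.sqrt(np.clip(vals, 0.0, None))) @ vecs.T


def gaussian_w2_squared(m1, S1, m2, S2):
    """Closed-form squared 2-Wasserstein distance between N(m1, S1) and N(m2, S2)."""
    S1_half = psd_sqrt(S1)
    cross = psd_sqrt(S1_half @ S2 @ S1_half)
    cov_term = np.trace(S1 + S2 - 2.0 * cross)
    # Clip tiny negative values caused by floating-point round-off.
    return float(np.sum((m1 - m2) ** 2) + max(cov_term, 0.0))


def gmm_distance(w1, means1, covs1, w2, means2, covs2):
    """Assumed approximation: optimal transport between mixture components.

    Returns the distance and the k1 x k2 transport plan between components.
    """
    k1, k2 = len(w1), len(w2)
    cost = np.array([
        [gaussian_w2_squared(means1[i], covs1[i], means2[j], covs2[j])
         for j in range(k2)]
        for i in range(k1)
    ])
    # Marginal constraints: rows of the plan sum to w1, columns sum to w2.
    A_eq = np.zeros((k1 + k2, k1 * k2))
    for i in range(k1):
        A_eq[i, i * k2:(i + 1) * k2] = 1.0
    for j in range(k2):
        A_eq[k1 + j, j::k2] = 1.0
    b_eq = np.concatenate([w1, w2])
    res = linprog(cost.ravel(), A_eq=A_eq, b_eq=b_eq,
                  bounds=(0, None), method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return float(np.sqrt(max(res.fun, 0.0))), res.x.reshape(k1, k2)


if __name__ == "__main__":
    # Two 1-D mixtures with two unit-variance components each (toy data).
    w = np.array([0.5, 0.5])
    covs = [np.eye(1), np.eye(1)]
    d, plan = gmm_distance(w, [np.array([0.0]), np.array([10.0])], covs,
                           w, [np.array([1.0]), np.array([11.0])], covs)
    print(d)
    print(plan)
```

In a schema-matching setting, one could fit a mixture to each column (or group of columns) of two tables and compare them with such a distance. How the paper actually builds and uses the mixtures is not stated in the abstract.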

Taxonomy

Topics: Bayesian Methods and Mixture Models · Anomaly Detection Techniques and Applications · Video Surveillance and Tracking Methods