Gaussian Database Alignment and Gaussian Planted Matching
Osman Emre Dai, Daniel Cullina, Negar Kiyavash

TL;DR
This paper explores the connection between database alignment and planted matching using Gaussian features, establishing performance thresholds and analyzing linear programming relaxations for these problems.
Contribution
It derives thresholds for database alignment and planted matching with Gaussian features, and analyzes LP relaxations to understand their effectiveness.
Findings
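The performance thresholds for database alignment converge to those for planted matching when the feature dimension is \(\omega(\log n)\), where \(n\) is the size of the alignment.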
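For almost-exact alignment, the relaxed LP algorithms reach the same threshold as the maximum likelihood algorithm.
For exact alignment, a gap remains between the thresholds of the relaxed algorithms and that of maximum likelihood.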
Abstract
Database alignment is a variant of the graph alignment problem: Given a pair of anonymized databases containing separate yet correlated features for a set of users, the problem is to identify the correspondence between the features and align the anonymized user sets based on correlation alone. This closely relates to planted matching, where given a bigraph with random weights, the goal is to identify the underlying matching that generated the given weights. We study an instance of the database alignment problem with multivariate Gaussian features and derive results that apply both for database alignment and for planted matching, demonstrating the connection between them. The performance thresholds for database alignment converge to that for planted matching when the dimensionality of the database features is \(\omega(\log n)\), where \(n\) is the size of the alignment, and no individual…
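As a rough illustration of the setup described in the abstract, the sketch below samples two small Gaussian databases with a hidden correspondence and recovers it by maximizing the total inner product over all matchings. It rests on simplifying assumptions that are not taken from the paper: every feature coordinate is independent with the same correlation `rho`, and the names `sample_databases` and `align_brute_force` are hypothetical. Under that assumption the norm terms of the Gaussian log-likelihood are the same for every perfect matching, so maximizing the summed inner products is equivalent to maximum likelihood. The exhaustive search is only practical for tiny `n`; it stands in for an assignment solver and is not the paper's algorithm.

```python
import itertools
import math
import random


def sample_databases(n, d, rho, rng):
    """Sample n users with d-dimensional features in two databases.

    Assumption (not from the paper): coordinates are i.i.d. standard
    Gaussians, and each coordinate of a matched pair has correlation rho.
    Returns (X, Y, planted), where planted[u] is the row of Y that
    belongs to the same user as row u of X.
    """
    X = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
    planted = list(range(n))
    rng.shuffle(planted)  # hidden correspondence to be recovered
    noise_scale = math.sqrt(1.0 - rho * rho)
    Y = [None] * n
    for u in range(n):
        Y[planted[u]] = [
            rho * x + noise_scale * rng.gauss(0.0, 1.0) for x in X[u]
        ]
    return X, Y, planted


def align_brute_force(X, Y):
    """Return the matching that maximizes the sum of inner products.

    Enumerates all n! permutations, so it is only usable for very small n.
    """
    n = len(X)
    # W[u][v] is the inner product between row u of X and row v of Y.
    W = [[sum(a * b for a, b in zip(x, y)) for y in Y] for x in X]
    best = max(
        itertools.permutations(range(n)),
        key=lambda perm: sum(W[u][perm[u]] for u in range(n)),
    )
    return list(best)


rng = random.Random(0)
X, Y, planted = sample_databases(n=6, d=100, rho=0.9, rng=rng)
estimate = align_brute_force(X, Y)
errors = sum(e != p for e, p in zip(estimate, planted))
print("misaligned users:", errors)
```

For larger instances the same weight matrix could be handed to a linear assignment solver such as `scipy.optimize.linear_sum_assignment` with `maximize=True`, which avoids the factorial enumeration.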
Taxonomy
Topics: Data Management and Algorithms · Time Series Analysis and Forecasting · Advanced Database Systems and Queries
Methods: ALIGN
