Group-matching algorithms for subjects and items

Géza Kiss; Kyle Gorman; Jan P.H. van Santen

arXiv:2110.04432·stat.ME·October 12, 2021

TL;DR

This paper addresses the problem of constructing matched groups that are statistically similar on multiple covariates by iteratively removing subjects or items from an existing sample. The problem is NP-hard, but heuristics implemented in the ldamatch package produce high-quality matches on artificial and real-world data.

Contribution

It introduces heuristic group-matching algorithms, implemented in the ldamatch package, for settings in which traditional pair-matching approaches may be inappropriate.

Findings

1. Heuristics produce high-quality matches on real-world data

2. The group-matching problem is NP-hard, requiring approximate solutions

3. The proposed methods outperform some existing approaches

Abstract

We consider the problem of constructing matched groups such that the resulting groups are statistically similar with respect to their average values for multiple covariates. This group-matching problem arises in many cases, including quasi-experimental and observational studies in which subjects or items are sampled from pre-existing groups, scenarios in which traditional pair-matching approaches may be inappropriate. We consider the case in which one is provided with an existing sample and iteratively eliminates samples so that the groups "match" according to arbitrary statistically-defined criteria. This problem is NP-hard. However, using artificial and real-world data sets, we show that heuristics implemented by the ldamatch package produce high-quality matches.
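To make the elimination idea in the abstract concrete, here is a minimal Python sketch of one possible greedy heuristic. It is not the ldamatch implementation (ldamatch is an R package), and everything in it is an assumption made for illustration: the function names `imbalance` and `greedy_match`, the restriction to two groups, the `min_size` floor, and the choice of the largest standardized mean difference across covariates as the matching criterion, standing in for the "arbitrary statistically-defined criteria" the abstract mentions.

```python
from statistics import mean, variance


def imbalance(a, b):
    """Largest absolute standardized mean difference over all covariates."""
    worst = 0.0
    for xs, ys in zip(zip(*a), zip(*b)):  # iterate covariate by covariate
        diff = abs(mean(xs) - mean(ys))
        if diff > 0:
            pooled_sd = ((variance(xs) + variance(ys)) / 2) ** 0.5
            worst = max(worst, diff / pooled_sd if pooled_sd > 0 else float("inf"))
    return worst


def greedy_match(groups, threshold=0.1, min_size=3):
    """Drop one subject at a time until the two groups meet the criterion.

    groups: dict mapping two group names to lists of covariate tuples.
    Each group needs at least two subjects, and min_size should be >= 2,
    because the sample variance is undefined for a single value.
    """
    groups = {name: list(rows) for name, rows in groups.items()}
    first, second = groups  # assumption: this sketch handles exactly two groups
    while imbalance(groups[first], groups[second]) > threshold:
        best = None  # (imbalance after removal, group name, row index)
        for name, other in ((first, second), (second, first)):
            rows = groups[name]
            if len(rows) <= min_size:
                continue  # never shrink a group below min_size
            for i in range(len(rows)):
                score = imbalance(rows[:i] + rows[i + 1:], groups[other])
                if best is None or score < best[0]:
                    best = (score, name, i)
        if best is None:
            raise ValueError("criterion not met before reaching min_size")
        del groups[best[1]][best[2]]  # commit the single best removal
    return groups


# Toy data: two covariates per subject, one clear outlier in group "b".
control = [(1.0, 10.0), (2.0, 20.0), (3.0, 30.0), (4.0, 40.0)]
treated = control + [(50.0, 500.0)]
matched = greedy_match({"a": control, "b": treated})
```

On this toy input the single outlier is the only subject removed. A greedy search like this can discard more subjects than necessary or fail where a matched subset exists, which is consistent with the problem being NP-hard: an exhaustive search over subsets would be exact but grows exponentially with sample size.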

Taxonomy

Topics: Advanced Causal Inference Techniques · Statistical Methods and Inference · Bayesian Modeling and Causal Inference