# Matching on What Matters: A Pseudo-Metric Learning Approach to Matching Estimation in High Dimensions

**Authors:** Gentry Johnson, Brian Quistorff, Matt Goldman

arXiv: 1905.12020 · 2019-05-30

## TL;DR

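This paper introduces neural-network-based methods that learn low-dimensional matching features for matching estimation in high dimensions. Unlike the metric learning approach of Li and Fu (2017), which lacks key theoretical guarantees, these methods are motivated by existing results in the matching literature, and they perform better in the reported benchmarks.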

## Contribution

The paper proposes learning latent matching features with either MLPs or siamese neural networks trained on a carefully selected loss function, and benchmarks the resulting estimators in simulations and on two experimental data sets.

## Key findings

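- The metric learning method of Li and Fu (2017) lacks key theoretical guarantees and can produce inconsistent estimates when treatment effects are heterogeneous.
- The neural-net-based methods show superior performance in simulations.
- They also show superior performance on two experimental data sets, including the canonical NSW worker training program data set.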

## Abstract

When pre-processing observational data via matching, we seek to approximate each unit with maximally similar peers that had an alternative treatment status, essentially replicating a randomized block design. However, as one considers a growing number of continuous features, a curse of dimensionality applies, making asymptotically valid inference impossible (Abadie and Imbens, 2006). The alternative of ignoring plausibly relevant features is certainly no better, and the resulting trade-off substantially limits the application of matching methods to "wide" datasets. Instead, Li and Fu (2017) recast the problem of matching in a metric learning framework that maps features to a low-dimensional space that facilitates "closer matches" while still capturing important aspects of unit-level heterogeneity. However, that method lacks key theoretical guarantees and can produce inconsistent estimates in cases of heterogeneous treatment effects. Motivated by a straightforward extension of existing results in the matching literature, we present alternative techniques that learn latent matching features through either MLPs or siamese neural networks trained on a carefully selected loss function. We benchmark the resulting alternative methods in simulations as well as against two experimental data sets, including the canonical NSW worker training program data set, and find superior performance of the neural-net-based methods.
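
The general idea of matching in a learned low-dimensional space, rather than on all raw features, can be illustrated with a small sketch. This is not the authors' method: as a stand-in for the MLP or siamese network, it assumes a linear least-squares fit of the control outcome as the one-dimensional "latent feature". The simulated data, the function names `fit_linear_features` and `att_by_matching`, and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np

def fit_linear_features(X, y):
    """Fit a 1-D feature map by least squares.

    Assumption: a linear stand-in for the neural-network feature
    learners described in the abstract.
    """
    Xb = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    return lambda Z: np.column_stack([np.ones(len(Z)), Z]) @ coef

def att_by_matching(X, y, t, feature_map):
    """Match each treated unit to its nearest control (with replacement)
    in the learned 1-D space and average the outcome differences."""
    z = feature_map(X)
    z_t, z_c = z[t == 1], z[t == 0]
    y_t, y_c = y[t == 1], y[t == 0]
    nearest = np.abs(z_t[:, None] - z_c[None, :]).argmin(axis=1)
    return float(np.mean(y_t - y_c[nearest]))

# Hypothetical simulated data: 20 features, only two affect the outcome,
# and one of those also drives treatment assignment (confounding).
rng = np.random.default_rng(0)
n, p = 2000, 20
X = rng.normal(size=(n, p))
propensity = 1 / (1 + np.exp(-X[:, 0]))
t = (rng.uniform(size=n) < propensity).astype(int)
true_effect = 2.0
y = 3 * X[:, 0] + X[:, 1] + true_effect * t + rng.normal(scale=0.1, size=n)

# Learn the feature map on control units only, then match in that space.
feature_map = fit_linear_features(X[t == 0], y[t == 0])
att = att_by_matching(X, y, t, feature_map)
naive = y[t == 1].mean() - y[t == 0].mean()
print(f"naive difference in means: {naive:.2f}")
print(f"matched estimate (ATT):    {att:.2f}")
```

Matching on a single learned score sidesteps the need to find close neighbours in all 20 raw dimensions, which is the trade-off the abstract describes. A linear map is used here only to keep the sketch short; the paper's contribution concerns neural-network feature learners and the loss functions used to train them.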

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1905.12020/full.md

## References

24 references — full list in the complete paper: https://tomesphere.com/paper/1905.12020/full.md

---
Source: https://tomesphere.com/paper/1905.12020