balance -- a Python package for balancing biased data samples

Tal Sarig; Tal Galili; Roee Eilat

arXiv:2307.06024 · stat.CO · July 14, 2023 · 1 citation

TL;DR

The paper introduces 'balance', an open-source Python package that helps researchers analyze and correct bias in survey data samples to improve the accuracy of insights and machine learning models.

Contribution

It presents a new Python package with a straightforward workflow for bias analysis and adjustment in survey data, including bias understanding, correction, and evaluation.

Findings

01. Provides a simple API for bias correction
02. Includes methods for bias assessment and adjustment
03. Enhances the reliability of survey-based insights

Abstract

Surveys are an important research tool, providing unique measurements on subjective experiences such as sentiment and opinions that cannot be measured by other means. However, because survey data is collected from a self-selected group of participants, directly inferring insights from it to a population of interest, or training ML models on such data, can lead to erroneous estimates or under-performing models. In this paper we present balance, an open-source Python package by Meta, offering a simple workflow for analyzing and adjusting biased data samples with respect to a population of interest. The balance workflow includes three steps: understanding the initial bias in the data relative to a target we would like to infer, adjusting the data to correct for the bias by producing weights for each unit in the sample based on propensity scores, and evaluating the final biases and the…
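The three-step workflow described in the abstract (understand the initial bias, adjust with propensity-based weights, evaluate what remains) can be sketched in a few lines. This is only an illustration on toy data, not the balance API: the helper names `shares`, `max_share_gap`, and `propensity_weights` are hypothetical, the single categorical covariate and its counts are invented, and the propensity is estimated from raw cell counts as a stand-in for whatever propensity model the package fits.

```python
from collections import Counter


def shares(values, weights=None):
    """Return the (optionally weighted) share of each category in `values`."""
    if weights is None:
        weights = [1.0] * len(values)
    totals = Counter()
    for value, weight in zip(values, weights):
        totals[value] += weight
    grand_total = sum(totals.values())
    return {key: total / grand_total for key, total in totals.items()}


def max_share_gap(sample_shares, target_shares):
    """Largest absolute difference in category shares (a simple bias measure)."""
    keys = set(sample_shares) | set(target_shares)
    return max(
        abs(sample_shares.get(k, 0.0) - target_shares.get(k, 0.0)) for k in keys
    )


def propensity_weights(sample, target):
    """Inverse-propensity weights for one categorical covariate (assumption:
    propensity is estimated from cell counts, not from a fitted model).

    Stacking sample and target, the propensity of being in the sample is
    p(x) = n_sample(x) / (n_sample(x) + n_target(x)); each unit gets (1 - p) / p.
    """
    n_sample = Counter(sample)
    n_target = Counter(target)
    weights = []
    for x in sample:
        p = n_sample[x] / (n_sample[x] + n_target[x])
        weights.append((1 - p) / p)
    return weights


# Toy data (invented): the sample over-represents the younger age group.
sample = ["18-34"] * 70 + ["35+"] * 30
target = ["18-34"] * 400 + ["35+"] * 600

# Step 1: understand the initial bias relative to the target.
before = max_share_gap(shares(sample), shares(target))

# Step 2: adjust by producing one weight per sample unit.
weights = propensity_weights(sample, target)

# Step 3: evaluate the bias that remains after weighting.
after = max_share_gap(shares(sample, weights), shares(target))

print(f"share gap before weighting: {before:.3f}")
print(f"share gap after weighting:  {after:.3f}")
```

With a single categorical covariate, the count-based propensity reproduces the target shares exactly, so the remaining gap is zero up to floating-point error. With several covariates or continuous ones, a fitted propensity model would be needed and some residual bias would typically remain.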

Code & Models

Repositories

facebookresearch/balance (official)

Taxonomy

Topics: Forecasting Techniques and Applications · Computational and Text Analysis Methods · Data Analysis with R