Parity Queries for Binary Classification
Hye Won Chung, Ji Oon Lee, Doyeon Kim, Alfred O. Hero

TL;DR
This paper investigates the optimal number of parity-based measurements needed to recover binary variables efficiently, revealing fundamental trade-offs between measurement complexity, query difficulty, and recovery accuracy.
Contribution
It introduces a method for designing queries that achieve near-optimal recovery of binary variables with minimal measurements, establishing key theoretical bounds.
Findings
Sample complexity scales as max{k, (k log k)/d̄} for full recovery, where k is the number of binary variables and d̄ is the query difficulty (the average size of the query subsets).
Sample complexity scales as max{k, (k log(1/δ))/d̄} for partial recovery.
Recovery accuracy, query difficulty, and the number of measurements trade off against one another.
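To get a feel for the two scaling expressions above, they can be evaluated directly. The sketch below is illustrative only: the function names are hypothetical, the natural logarithm is an assumption (the base only affects constants), and the constant factors that the scaling statements leave unspecified are omitted, so the outputs are orders of growth rather than actual measurement counts.

```python
import math

def full_recovery_scaling(k, d_bar):
    """Evaluate max{k, (k log k) / d_bar}, with constant factors omitted."""
    return max(k, k * math.log(k) / d_bar)

def partial_recovery_scaling(k, d_bar, delta):
    """Evaluate max{k, (k log(1/delta)) / d_bar}, with constant factors omitted."""
    return max(k, k * math.log(1.0 / delta) / d_bar)

if __name__ == "__main__":
    k = 1000
    # Larger average query size d_bar lowers the second term until k dominates.
    for d_bar in (1, 2, 5, 10, 100):
        print(d_bar, full_recovery_scaling(k, d_bar),
              partial_recovery_scaling(k, d_bar, delta=0.01))
```

Once d̄ is large enough that the logarithmic term falls below k, increasing the query difficulty further no longer reduces the expression, which is one way to read the trade-off stated in the findings.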
Abstract
Consider a query-based data acquisition problem that aims to recover the values of k binary variables from parity (XOR) measurements of chosen subsets of the variables. Assume the response model where only a randomly selected subset of the measurements is received. We propose a method for designing a sequence of queries so that the variables can be identified with high probability using as few measurements as possible. We define the query difficulty d̄ as the average size of the query subsets and the sample complexity as the minimum number of measurements required to attain a given recovery accuracy. We obtain fundamental trade-offs between recovery accuracy, query difficulty, and sample complexity. In particular, the necessary and sufficient sample complexity required for recovering all variables with high probability is …
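The measurement model in the abstract can be simulated in a few lines. The sketch below is not the paper's query design: as an assumption for illustration, each query includes every variable independently with probability d̄/k (so the expected subset size is d̄), each measurement is received with a fixed probability `keep_prob`, and recovery uses plain Gaussian elimination over GF(2). All function and parameter names are hypothetical.

```python
import random

def parity_measurements(x, n, d_bar, keep_prob, rng):
    """Issue n random parity queries on x and return only the received ones."""
    k = len(x)
    received = []
    for _ in range(n):
        # Assumed query design: include each variable with probability d_bar / k.
        subset = [i for i in range(k) if rng.random() < d_bar / k]
        parity = 0
        for i in subset:
            parity ^= x[i]  # XOR of the selected variables
        # Response model: the measurement is received only with prob. keep_prob.
        if rng.random() < keep_prob:
            received.append((subset, parity))
    return received

def solve_gf2(measurements, k):
    """Gaussian elimination over GF(2); returns x, or None if x is not unique."""
    pivots = {}  # highest set bit -> (row bitmask, parity)
    for subset, parity in measurements:
        mask = 0
        for i in subset:
            mask |= 1 << i
        # Reduce the row against existing pivots until it is zero or a new pivot.
        while mask:
            high = mask.bit_length() - 1
            if high not in pivots:
                pivots[high] = (mask, parity)
                break
            pmask, ppar = pivots[high]
            mask ^= pmask
            parity ^= ppar
    if len(pivots) < k:
        return None  # rank deficient: the received measurements do not pin down x
    # Back substitution: the pivot row for bit b only involves bits <= b.
    x = [0] * k
    for b in range(k):
        mask, val = pivots[b]
        rest = mask ^ (1 << b)
        while rest:
            low = rest.bit_length() - 1
            val ^= x[low]
            rest ^= 1 << low
        x[b] = val
    return x

if __name__ == "__main__":
    rng = random.Random(0)
    k = 50
    x = [rng.randint(0, 1) for _ in range(k)]
    meas = parity_measurements(x, n=400, d_bar=5, keep_prob=0.5, rng=rng)
    recovered = solve_gf2(meas, k)
    print("received:", len(meas), "exact recovery:", recovered == x)
```

Varying `n`, `d_bar`, and `keep_prob` in this toy setup shows qualitatively how the number of measurements and the average query size interact, though it says nothing about the constants or the optimality of the design studied in the paper.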
Taxonomy
Topics: Machine Learning and Algorithms · Machine Learning and Data Classification · Bayesian Modeling and Causal Inference
