# Coding for Crowdsourced Classification with XOR Queries

**Authors:** James Chin-Jen Pang, Hessam Mahdavifar, and S. Sandeep Pradhan

arXiv: 1906.10637 · 2020-02-03

## TL;DR

This paper models crowdsourced classification as a sparsely encoded source coding problem in which each query answer is the XOR of a small number of labels. It proposes querying schemes that use an almost optimal number of queries, and extends them to unresponsive workers and to two correlated labeling systems.

## Contribution

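The paper connects crowdsourced labeling to well-studied channel codes with sparse representations, and uses that connection to build XOR-based querying schemes with an almost optimal number of queries, each involving only a constant number of labels. It then extends the schemes to unresponsive workers and to two correlated labeling systems.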

## Key findings

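- Querying schemes with an almost optimal number of queries, each involving only a constant number of labels
- With unresponsive workers, querying schemes in which each query involves only log n items, where n is the total number of items to be labeled
- Two-stage querying schemes for two correlated labeling systems, again with an almost optimal number of queries and a constant number of labels per query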
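
To make the query model concrete, the following is a minimal sketch of how XOR query answers are formed from binary labels. It is an illustration only, not the paper's construction: the function names `make_queries` and `answer_queries`, the random choice of items, and the toy label vector are all assumptions, and workers are assumed to answer without error.

```python
import random


def make_queries(n, m, d, seed=0):
    """Pick m queries, each naming d distinct items out of n.

    Hypothetical random design for illustration; the paper derives its
    queries from sparse codes, which this sketch does not reproduce.
    """
    rng = random.Random(seed)
    return [sorted(rng.sample(range(n), d)) for _ in range(m)]


def answer_queries(labels, queries):
    """Return one bit per query: the XOR of the labels the query names."""
    answers = []
    for query in queries:
        bit = 0
        for item in query:
            bit ^= labels[item]
        answers.append(bit)
    return answers


# Toy instance: six items with binary labels, three hand-picked queries.
labels = [1, 0, 0, 1, 0, 1]
queries = [[0, 1], [1, 2, 3], [0, 3, 5]]
answers = answer_queries(labels, queries)
print(answers)  # [1, 1, 1]
```

Each query here touches at most three labels regardless of how many items exist, which is the sense in which a query has "constant" size.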

## Abstract

This paper models the crowdsourced labeling/classification problem as a sparsely encoded source coding problem, where each query answer, regarded as a code bit, is the XOR of a small number of labels, regarded as source information bits. We leverage the connections between this problem and well-studied codes with sparse representations for the channel coding problem to provide querying schemes with an almost optimal number of queries, each of which involves only a constant number of labels. We also extend this scenario to the case where some workers can be unresponsive. For this case, we propose querying schemes where each query involves only log n items, where n is the total number of items to be labeled. Furthermore, we consider classification of two correlated labeling systems and provide two-stage querying schemes with an almost optimal number of queries, each involving a constant number of labels.
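
One way to picture the unresponsive-worker setting is to treat a missing answer as an erasure and recover the labels from the answers that did arrive. The sketch below does this with plain Gaussian elimination over GF(2). It is a generic linear-algebra illustration, not the decoder or the code construction from the paper; the function name `recover_labels`, the use of `None` for an unresponsive worker, and the toy queries are assumptions, and responding workers are assumed to be noiseless.

```python
def recover_labels(queries, answers, n):
    """Recover n binary labels from XOR queries with some answers missing.

    queries: list of lists of item indices (each query XORs those labels).
    answers: list of 0/1 bits, or None where the worker did not respond.
    Returns the label list, or None if the received answers do not
    determine every label uniquely.
    """
    pivots = {}  # pivot column -> (bitmask of items in the row, answer bit)
    for query, bit in zip(queries, answers):
        if bit is None:
            continue  # unresponsive worker: this equation is erased
        mask = 0
        for item in query:
            mask ^= 1 << item
        # Clear every existing pivot column from the new row.
        for col, (p_mask, p_bit) in pivots.items():
            if (mask >> col) & 1:
                mask ^= p_mask
                bit ^= p_bit
        if mask == 0:
            continue  # redundant equation (noiseless answers assumed)
        col = mask.bit_length() - 1
        # Clear the new pivot column from the rows already stored.
        for c, (p_mask, p_bit) in list(pivots.items()):
            if (p_mask >> col) & 1:
                pivots[c] = (p_mask ^ mask, p_bit ^ bit)
        pivots[col] = (mask, bit)
    if len(pivots) < n:
        return None
    return [pivots[i][1] for i in range(n)]


# Toy instance: four items, six queries, two workers unresponsive.
queries = [[0, 1], [1, 2], [2, 3], [0, 3], [0, 1, 2], [3]]
answers = [1, None, 0, 0, 0, None]
print(recover_labels(queries, answers, 4))  # [1, 0, 1, 1]
```

The redundancy in the query set is what lets the labels survive missing answers: if too few workers respond, the function returns `None` because the remaining equations no longer pin down every label.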

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1906.10637/full.md

## References

27 references — full list in the complete paper: https://tomesphere.com/paper/1906.10637/full.md

---
Source: https://tomesphere.com/paper/1906.10637