Optimal Clustering from Noisy Binary Feedback
Kaito Ariu, Jungseul Ok, Alexandre Proutiere, Se-Young Yun

TL;DR
This paper studies how to cluster items from noisy binary feedback. It proposes algorithms whose error rates nearly match information-theoretic lower bounds and shows that adaptively selecting items and questions improves performance.
Contribution
The paper introduces near-optimal algorithms for clustering from noisy binary feedback, including an adaptive method that allocates queries according to item difficulty and question relevance.
Findings
The algorithms nearly match information-theoretic lower bounds.
Adaptive selection improves clustering accuracy and efficiency.
Numerical experiments confirm the advantage of adaptive strategies.
Abstract
We study the problem of clustering a set of items from binary user feedback. Such a problem arises in crowdsourcing platforms solving large-scale labeling tasks with minimal effort put on the users. For example, in some of the recent reCAPTCHA systems, users' clicks (binary answers) can be used to efficiently label images. In our inference problem, items are grouped into initially unknown non-overlapping clusters. To recover these clusters, the learner sequentially presents to users a finite list of items together with a question with a binary answer selected from a fixed finite set. For each of these items, the user provides a noisy answer whose expectation is determined by the item cluster and the question and by an item-specific parameter characterizing the hardness of classifying the item. The objective is to devise an algorithm with a minimal cluster recovery error rate. We…
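The feedback model described in the abstract can be illustrated with a small simulation. This is a minimal sketch, not the paper's algorithm: the abstract does not give the exact parametric form, so the mixing rule in `answer_prob` (hardness interpolating between a fair coin and a cluster-level probability) is an assumption, and the names `answer_prob`, `simulate_answers`, `empirical_profile`, `p`, `cluster_of`, and `hardness` are all hypothetical.

```python
import random


def answer_prob(p_kl: float, h_i: float) -> float:
    """Probability of a positive answer for one item and one question.

    ASSUMPTION (not taken from the abstract): the hardness parameter h_i
    lies in [0, 1] and interpolates between a fair coin (h_i = 0, the item
    is impossible to classify) and the cluster-level probability p_kl
    (h_i = 1, the item is as easy as its cluster allows).
    """
    return h_i * p_kl + (1.0 - h_i) * 0.5


def simulate_answers(p, cluster_of, hardness, item, question, n, rng):
    """Draw n independent noisy binary answers (0 or 1) for (item, question)."""
    q = answer_prob(p[cluster_of[item]][question], hardness[item])
    return [1 if rng.random() < q else 0 for _ in range(n)]


def empirical_profile(p, cluster_of, hardness, item, n_per_question, rng):
    """Fraction of positive answers per question, with a uniform
    (non-adaptive) budget of n_per_question queries on every question."""
    n_questions = len(p[0])
    profile = []
    for question in range(n_questions):
        answers = simulate_answers(
            p, cluster_of, hardness, item, question, n_per_question, rng
        )
        profile.append(sum(answers) / n_per_question)
    return profile
```

Under this assumed model, an item with low hardness value `h_i` produces answer frequencies close to 1/2 on every question, so more queries are needed to tell its cluster apart. That is the intuition behind spending more of the query budget on hard items and on the questions that separate clusters best, rather than querying uniformly as `empirical_profile` does.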
Taxonomy
Topics: Mobile Crowdsensing and Crowdsourcing · Advanced Bandit Algorithms Research · Machine Learning and Algorithms
