Online Label Aggregation: A Variational Bayesian Approach
Chi Hong, Amirmasoud Ghiassi, Yichi Zhou, Robert Birke, Lydia Y. Chen

TL;DR
This paper introduces BiLA, a novel online label aggregation framework using variational Bayesian inference, capable of incrementally inferring true labels efficiently in crowdsourced data scenarios.
Contribution
BiLA is a new online label aggregation method that combines variational Bayesian inference with a stochastic optimization scheme for incremental training, and it can accommodate any label-generating distribution.
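To make the idea of incremental aggregation concrete, here is a minimal sketch of a generic streaming variational Bayes update. It is not the BiLA algorithm or its optimizer. It assumes binary labels, a uniform prior over the true label, and a "one-coin" worker model in which each worker has a single accuracy with a Beta posterior. The names OnlineAggregator, infer, update, prior_correct and prior_wrong are hypothetical and chosen for illustration only.

```python
import math


def digamma(x):
    """Digamma for x > 0: upward recurrence, then an asymptotic series."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return result + math.log(x) - 0.5 / x - series


class OnlineAggregator:
    """Streaming variational Bayes for binary labels (illustrative, not BiLA).

    Assumption: each worker w is correct with probability p_w, and the
    posterior over p_w is Beta(a[w], b[w]).
    """

    def __init__(self, n_workers, prior_correct=2.0, prior_wrong=1.0):
        self.a = [prior_correct] * n_workers  # pseudo-counts of correct votes
        self.b = [prior_wrong] * n_workers    # pseudo-counts of wrong votes

    def infer(self, votes):
        """votes maps worker id -> 0 or 1; returns q(true label = 1)."""
        logit = 0.0
        for w, v in votes.items():
            # E[log p_w] - E[log(1 - p_w)] under Beta(a, b)
            strength = digamma(self.a[w]) - digamma(self.b[w])
            logit += strength if v == 1 else -strength
        return 1.0 / (1.0 + math.exp(-logit))

    def update(self, batch):
        """Process one mini-batch of items; returns q(label = 1) per item."""
        posteriors = []
        for votes in batch:
            q1 = self.infer(votes)
            posteriors.append(q1)
            for w, v in votes.items():
                # Soft count: probability that this worker's vote was correct
                p_correct = q1 if v == 1 else 1.0 - q1
                self.a[w] += p_correct
                self.b[w] += 1.0 - p_correct
        return posteriors


if __name__ == "__main__":
    agg = OnlineAggregator(n_workers=3)
    # Two mini-batches arriving over time; worker 2 tends to disagree.
    print(agg.update([{0: 1, 1: 1, 2: 0}, {0: 0, 1: 0, 2: 1}]))
    print(agg.update([{0: 1, 2: 0}]))
```

Each call to update consumes only the newly arrived items, so worker reliability estimates are refined batch by batch without revisiting earlier data. That is the property an online aggregator needs, although the paper's own update rule and convergence guarantees are not reproduced here.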
Findings
BiLA reduces label inference error rate by 10-15% on synthetic data.
BiLA outperforms state-of-the-art methods based on minimax entropy and neural networks.
BiLA demonstrates effective real-world application in online label aggregation scenarios.
Abstract
Noisy labeled data is more the norm than a rarity in crowdsourced content. Aggregating the results from crowd workers is an effective way to filter out noise and infer correct labels. To keep results timely and to cope with slow worker responses, online label aggregation is increasingly in demand, calling for solutions that can incrementally infer the true label distribution from subsets of data items. In this paper, we propose a novel online label aggregation framework, BiLA, which employs a variational Bayesian inference method and a novel stochastic optimization scheme for incremental training. BiLA is flexible enough to accommodate any generating distribution of labels through exact computation of its posterior distribution. We also derive a convergence bound for the proposed optimizer. We compare BiLA with the state of the art based on minimax entropy, neural networks and expectation…
Taxonomy
Topics: Mobile Crowdsensing and Crowdsourcing · Data Stream Mining Techniques · Anomaly Detection Techniques and Applications
