Representation Learning from Limited Educational Data with Crowdsourced Labels
Wentao Wang, Guowei Xu, Wenbiao Ding, Gale Yan Huang, Guoliang Li, Jiliang Tang, Zitao Liu

TL;DR
This paper introduces a novel framework for effective representation learning from limited, noisy, and crowdsourced labels, specifically tailored for educational data, using a grouping neural network, Bayesian confidence estimation, and hard example selection.
Contribution
It proposes a new deep learning framework that handles noisy crowdsourced labels and limited data, improving representation learning in educational contexts.
Findings
Outperforms state-of-the-art baselines on three real-world datasets.
Effectively captures label inconsistency with Bayesian confidence estimator.
Enhances training efficiency through adaptive hard example selection.
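To make the idea of a Bayesian confidence estimate over crowdsourced votes concrete, here is a minimal sketch. It is not taken from the paper: it assumes binary labels, a Beta(alpha, beta) prior on the probability that an example is positive, and uses the posterior mean as the confidence score. The function name `label_confidence` and the default prior values are hypothetical choices for illustration.

```python
from typing import Sequence


def label_confidence(votes: Sequence[int], alpha: float = 1.0, beta: float = 1.0) -> float:
    """Posterior-mean confidence that an example is positive.

    Assumption (not from the paper): each vote is 0 or 1, and the prior over
    the positive-label probability is Beta(alpha, beta). After observing the
    votes, the posterior is Beta(alpha + positives, beta + negatives), whose
    mean is returned here.
    """
    if alpha <= 0 or beta <= 0:
        raise ValueError("alpha and beta must be positive")
    if any(v not in (0, 1) for v in votes):
        raise ValueError("votes must be binary (0 or 1)")
    positives = sum(votes)
    return (alpha + positives) / (alpha + beta + len(votes))


if __name__ == "__main__":
    # Five workers agree: confidence is high but stays below 1.
    print(label_confidence([1, 1, 1, 1, 1]))
    # Workers disagree 3 to 2: confidence moves toward 0.5.
    print(label_confidence([1, 1, 1, 0, 0]))
```

Under these assumptions, a unanimous vote from few workers yields a confidence below 1, and a split vote yields a value near 0.5, which is one way label inconsistency can be reflected as a soft weight rather than a hard majority label.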
Abstract
Representation learning has been proven to play an important role in the unprecedented success of machine learning models in numerous tasks, such as machine translation, face recognition and recommendation. Most existing representation learning approaches require a large number of consistent and noise-free labels. However, due to various reasons such as budget constraints and privacy concerns, labels are very limited in many real-world scenarios. Directly applying standard representation learning approaches to small labeled data sets easily runs into over-fitting problems and leads to sub-optimal solutions. Even worse, in some domains such as education, the limited labels are usually annotated by multiple workers with diverse expertise, which introduces noise and inconsistency in such crowdsourcing settings. In this paper, we propose a novel framework which aims to…
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Mobile Crowdsensing and Crowdsourcing · Machine Learning and Data Classification
