Learning to Rank from Samples of Variable Quality
Mostafa Dehghani, Jaap Kamps

TL;DR
This paper introduces fidelity-weighted learning (FWL), a semi-supervised approach that leverages both high-quality and weakly-labeled data by estimating label confidence to improve deep neural network training.
Contribution
The paper proposes a novel semi-supervised student-teacher framework that accounts for label quality, enhancing learning from mixed-quality datasets.
Findings
FWL outperforms state-of-the-art semi-supervised methods in document ranking.
The approach effectively utilizes weakly-labeled data with confidence weighting.
Experimental results demonstrate improved ranking performance.
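The confidence weighting mentioned in the findings can be illustrated with a minimal sketch. This summary does not say how FWL computes label confidence or how it scales updates, so the code below is only an assumption-laden toy: a one-dimensional linear model trained by per-sample SGD, where each step is multiplied by a given confidence in [0, 1]. The function names, the toy data, and the confidence values are all hypothetical and are not taken from the paper.

```python
def weighted_sgd_pass(w, b, samples, confidences, base_lr=0.1):
    """One pass of per-sample SGD on squared error for the model y = w*x + b.

    Each update is scaled by the sample's confidence in [0, 1], so labels
    judged less reliable move the parameters less. (Illustrative assumption,
    not the paper's exact update rule.)
    """
    for (x, y), c in zip(samples, confidences):
        err = (w * x + b) - y      # prediction error on this sample
        lr = base_lr * c           # confidence-scaled step size
        w -= lr * err * x
        b -= lr * err
    return w, b


def fit(samples, confidences, epochs=500, base_lr=0.1):
    """Run several confidence-weighted SGD passes starting from zero."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        w, b = weighted_sgd_pass(w, b, samples, confidences, base_lr)
    return w, b


# Toy data: the clean relation is y = 2x; the last two labels are corrupted
# to stand in for weak labels.
samples = [(0.0, 0.0), (0.5, 1.0), (1.0, 2.0), (0.25, 3.0), (0.75, -1.0)]

# Hypothetical confidences: full trust in clean labels, none in corrupted ones.
trusted = [1.0, 1.0, 1.0, 0.0, 0.0]
# Baseline that treats every label as equally reliable.
uniform = [1.0, 1.0, 1.0, 1.0, 1.0]

w_conf, b_conf = fit(samples, trusted)
w_unif, b_unif = fit(samples, uniform)

print("confidence-weighted slope:", w_conf)
print("uniformly weighted slope: ", w_unif)
```

In this toy setting the confidence-weighted fit recovers a slope close to 2, while the uniformly weighted fit is pulled away by the corrupted labels. Real confidences would be soft values produced by some estimator rather than hand-set zeros and ones.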
Abstract
Training deep neural networks requires many training samples, but in practice, training labels are expensive to obtain and may be of varying quality: some may come from trusted expert labelers, while others may come from heuristics or other sources of weak supervision such as crowd-sourcing. This creates a fundamental quality-versus-quantity trade-off in the learning process. Do we learn from the small amount of high-quality data or the potentially large amount of weakly-labeled data? We argue that if the learner could somehow know the label quality and take it into account when learning the data representation, we could get the best of both worlds. To this end, we introduce "fidelity-weighted learning" (FWL), a semi-supervised student-teacher approach for training deep neural networks using weakly-labeled data. FWL modulates the parameter updates to a student network (trained on the task…
Taxonomy
Topics: Machine Learning and Data Classification · Imbalanced Data Classification Techniques · Machine Learning and Algorithms
