biquality-learn: a Python library for Biquality Learning
Pierre Nodet, Vincent Lemaire, Alexis Bondu, Antoine Cornuéjols

TL;DR
biquality-learn is a Python library that facilitates machine learning from datasets with weak supervision and dataset shifts by leveraging a small trusted dataset, thus addressing real-world data challenges.
Contribution
It introduces an easy-to-use Python library of biquality learning algorithms, enabling robust model training under weak supervision and dataset shifts.
Findings
Provides a consistent API for biquality learning
Includes well-proven algorithms for handling weak supervision
Enables reproducible research on biquality data
Abstract
The democratization of Data Mining has been widely successful thanks in part to powerful and easy-to-use Machine Learning libraries. These libraries have been particularly tailored to tackle Supervised Learning. However, strong supervision signals are scarce in practice, and practitioners must resort to weak supervision. In addition to weaknesses of supervision, dataset shifts are another kind of phenomenon that occurs when deploying machine learning models in the real world. That is why Biquality Learning has been proposed as a machine learning framework to design algorithms capable of handling multiple weaknesses of supervision and dataset shifts without assumptions on their nature and level by relying on the availability of a small trusted dataset composed of cleanly labeled and representative samples. Thus we propose biquality-learn: a Python library for Biquality Learning with an…
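To make the biquality setting described in the abstract concrete, the following is a minimal, standard-library-only sketch. It does not use biquality-learn and does not reproduce any of its algorithms. It only illustrates the data layout (a small trusted set stacked with a larger untrusted set, plus a per-sample quality flag) and one deliberately naive heuristic: an untrusted sample keeps its weight only if its label agrees with the label of its nearest trusted sample. The names `make_biquality_dataset`, `agreement_weights`, and `sample_quality` are assumptions chosen for this illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def make_biquality_dataset(
    trusted_X: Sequence[Point],
    trusted_y: Sequence[int],
    untrusted_X: Sequence[Point],
    untrusted_y: Sequence[int],
) -> Tuple[List[Point], List[int], List[int]]:
    """Stack both datasets and flag each row: 1 = trusted, 0 = untrusted."""
    X = list(trusted_X) + list(untrusted_X)
    y = list(trusted_y) + list(untrusted_y)
    sample_quality = [1] * len(trusted_X) + [0] * len(untrusted_X)
    return X, y, sample_quality


def squared_distance(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def agreement_weights(
    X: Sequence[Point], y: Sequence[int], sample_quality: Sequence[int]
) -> List[float]:
    """Hypothetical heuristic, not a library algorithm.

    Trusted rows get weight 1.0. An untrusted row gets 1.0 if its label
    matches the label of its nearest trusted row, and 0.0 otherwise.
    """
    trusted = [(x, label) for x, label, q in zip(X, y, sample_quality) if q == 1]
    if not trusted:
        raise ValueError("at least one trusted sample is required")
    weights = []
    for x, label, q in zip(X, y, sample_quality):
        if q == 1:
            weights.append(1.0)
            continue
        _, nearest_label = min(trusted, key=lambda t: squared_distance(x, t[0]))
        weights.append(1.0 if nearest_label == label else 0.0)
    return weights


# Toy data: two clean trusted samples, three untrusted ones.
# The last untrusted sample lies near class 0 but is labeled 1.
trusted_X = [(0.0, 0.0), (1.0, 1.0)]
trusted_y = [0, 1]
untrusted_X = [(0.1, 0.0), (0.9, 1.0), (0.2, 0.1)]
untrusted_y = [0, 1, 1]

X, y, sample_quality = make_biquality_dataset(
    trusted_X, trusted_y, untrusted_X, untrusted_y
)
weights = agreement_weights(X, y, sample_quality)
print(sample_quality)
print(weights)
```

The resulting weights could then be passed to any learner that accepts per-sample weights. Practical biquality algorithms typically estimate such weights or label corrections in more principled ways; this sketch only shows where the trusted set enters the computation.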
Taxonomy
Topics: Advanced Causal Inference Techniques
Methods: Lib
