Quantifying Human Bias and Knowledge to guide ML models during Training

Hrishikesh Viswanath; Andrey Shor; Yoshimasa Kitaguchi

arXiv:2211.10796 · cs.LG · November 22, 2022

TL;DR

This paper introduces a crowdsourcing method to quantify human-perceived feature importance, guiding ML models to better handle biased datasets and improve classification outcomes.

Contribution

It presents a novel approach of using human rankings of feature importance to initialize model weights, enhancing learning on skewed datasets.
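One way to picture how several people's feature rankings could be collapsed into a single importance score per feature is a Borda-style count, sketched below. This is an illustrative assumption, not the paper's stated aggregation rule: the function name `aggregate_rankings`, the feature names, and the normalization to a sum of 1 are all hypothetical.

```python
def aggregate_rankings(rankings):
    """Combine per-person rankings into normalized importance scores.

    rankings: list of lists; each inner list orders the same feature
    names from most important to least important.
    Assumption: Borda-style points (top rank earns the most points).
    """
    n_features = len(rankings[0])
    scores = {feature: 0.0 for feature in rankings[0]}
    for ranking in rankings:
        for position, feature in enumerate(ranking):
            # Position 0 earns n_features points, the last earns 1.
            scores[feature] += n_features - position
    total = sum(scores.values())
    # Normalize so the scores sum to 1 and can be used as weight seeds.
    return {feature: s / total for feature, s in scores.items()}


# Hypothetical rankings from two annotators over three features.
example_rankings = [
    ["age", "income", "zip"],
    ["income", "age", "zip"],
]
importance = aggregate_rankings(example_rankings)
```

In this toy case the two annotators disagree on the top feature, so `age` and `income` end up tied, while `zip` receives the smallest share.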

Findings

01. Human-guided initial weights improve model accuracy.

02. Aggregated human opinions help mitigate dataset bias.

03. The method is effective on neural networks and SVMs for binary classification.

Abstract

This paper discusses a crowdsourcing-based method that we designed to quantify the importance of different attributes of a dataset in determining the outcome of a classification problem. This heuristic, provided by humans, acts as the initial weight seed for machine learning models and guides the model towards a better optimum during gradient descent. Datasets are often skewed, overrepresenting items of certain classes while underrepresenting the rest. Skewed datasets may lead to unforeseen issues with models, such as learning a biased function or overfitting. Traditional data augmentation techniques in supervised learning include oversampling and training with synthetic data. We introduce an experimental approach to dealing with such unbalanced datasets by including humans in the training process. We ask humans to…
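The idea of using a human-provided heuristic as the initial weight seed for gradient descent can be sketched with a small logistic-regression classifier. This is a minimal illustration under stated assumptions, not the authors' implementation: the toy data, the seed values in `human_seed`, the learning rate, and the function names are all hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def log_loss(X, y, w, b):
    """Mean binary cross-entropy of weights w and bias b on (X, y)."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for xi, yi in zip(X, y):
        p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
        total -= yi * math.log(p + eps) + (1 - yi) * math.log(1 - p + eps)
    return total / len(X)


def train_logistic(X, y, init_w, lr=0.1, epochs=200):
    """Full-batch gradient descent starting from init_w instead of a random seed."""
    w = list(init_w)
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


# Hypothetical toy data: the first feature separates the two classes.
X = [[1.0, 0.2], [0.9, 0.8], [0.1, 0.3], [0.2, 0.9]]
y = [1, 1, 0, 0]

# Assumed human-derived seed: people judged feature 0 as more important.
human_seed = [0.8, 0.2]

loss_before = log_loss(X, y, human_seed, 0.0)
w, b = train_logistic(X, y, human_seed)
loss_after = log_loss(X, y, w, b)
```

The only change from an ordinary training loop is the starting point: `init_w` carries the human judgment, and gradient descent then refines it on the data.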

Taxonomy

Topics: Mobile Crowdsensing and Crowdsourcing · Machine Learning and Data Classification