Stochastic gradient descent with gradient estimator for categorical features

Paul Peseux; Maxime Berar; Thierry Paquet; Victor Nicollet

arXiv:2209.03771 · cs.LG · April 19, 2023

TL;DR

This paper introduces a novel gradient estimator tailored for sparse, one-hot encoded categorical data, improving optimization in machine learning models applied to such data types.

Contribution

It proposes a new gradient estimator specifically designed for sparse categorical features, enhancing model training and interpretability.

Findings

01 The new estimator performs better than common gradient estimators under similar settings on multiple datasets.

02 It improves model convergence and accuracy with categorical data.

03 A real-world retail dataset is released, after anonymization, for further research.

Abstract

Categorical data are present in key areas such as health and supply chain, and these data require specific treatment. To apply recent machine learning models to such data, encoding is needed. For building interpretable models, one-hot encoding is still a very good solution, but such encoding creates sparse data. Gradient estimators are not suited to sparse data: the gradient is mostly treated as zero when in fact it does not always exist. A novel gradient estimator is therefore introduced. We show what this estimator minimizes in theory and demonstrate its efficiency on different datasets with multiple model architectures. This new estimator performs better than common estimators under similar settings. A real-world retail dataset is also released after anonymization. Overall, the aim of this paper is to thoroughly consider categorical data and adapt models and optimizers to these…
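The sparsity issue mentioned in the abstract can be illustrated with a small sketch. It is not the estimator proposed in the paper; it only shows, under assumed toy values, how a standard mini-batch gradient for a linear model on one-hot inputs assigns exactly zero to every category that does not appear in the batch. The function names `one_hot` and `minibatch_gradient`, the squared-error loss, and the data are all hypothetical choices made for illustration.

```python
def one_hot(index, num_categories):
    """Encode a category index as a one-hot vector (plain Python list)."""
    return [1.0 if j == index else 0.0 for j in range(num_categories)]


def minibatch_gradient(weights, batch_x, batch_y):
    """Gradient of the mean squared error of y_hat = <weights, x> over a batch."""
    n = len(batch_y)
    grad = [0.0] * len(weights)
    for x, y in zip(batch_x, batch_y):
        residual = sum(w * xj for w, xj in zip(weights, x)) - y
        for j, xj in enumerate(x):
            # xj is 0 for every category the sample does not belong to,
            # so those coordinates receive no contribution at all.
            grad[j] += 2.0 * residual * xj / n
    return grad


# Toy batch (assumed values): 4 possible categories, only 0 and 2 are observed.
num_categories = 4
batch_categories = [0, 0, 2]
batch_x = [one_hot(c, num_categories) for c in batch_categories]
batch_y = [1.0, 3.0, 5.0]
weights = [0.0] * num_categories

grad = minibatch_gradient(weights, batch_x, batch_y)
absent = [j for j in range(num_categories) if j not in batch_categories]

print("gradient:", grad)
print("categories absent from the batch:", absent)
```

In this sketch the coordinates listed in `absent` get a gradient of zero simply because no sample carries information about them, which is the situation the abstract describes as a gradient being "considered as zero" when it is really undefined for that batch.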

Code & Models

Repositories

ppmdatix/gce (PyTorch, official)

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Statistical Methods and Inference · Machine Learning and Algorithms