Maximum Relevance and Minimum Redundancy Feature Selection Methods for a Marketing Machine Learning Platform

Zhenyu Zhao; Radhika Anand; Mallory Wang

arXiv:1908.05376 · stat.ML · August 16, 2019 · 23 cites

TL;DR

This paper extends and evaluates mRMR feature selection methods for marketing machine learning at Uber, introducing non-linear redundancy and model-based relevance measures, leading to improved feature selection in large-scale classification tasks.

Contribution

The paper introduces novel non-linear redundancy and model-based relevance measures to enhance mRMR feature selection for marketing applications.

Findings

01. Extended mRMR methods outperform existing techniques in empirical tests.

02. Selected mRMR method improves model accuracy and efficiency.

03. Implementation in production demonstrates practical benefits.

Abstract

In machine learning applications for online product offerings and marketing strategies, there are often hundreds or thousands of features available to build such models. Feature selection is an essential method in such applications and serves multiple objectives: improving prediction accuracy by eliminating irrelevant features, accelerating model training and prediction, reducing the monitoring and maintenance workload for the feature data pipeline, and providing better model interpretation and diagnosis capability. However, selecting an optimal feature subset from a large feature space is considered an NP-complete problem. The mRMR (Minimum Redundancy and Maximum Relevance) feature selection framework addresses this problem by selecting relevant features while controlling for redundancy among the selected features. This paper describes the approach to extend, evaluate, and…
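The greedy idea behind mRMR described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: relevance is measured as the absolute Pearson correlation with the target, redundancy as the mean absolute Pearson correlation with the features already chosen, and the two are combined as a difference. The function names `pearson` and `mrmr_select` and the synthetic data are hypothetical, and the paper's extended variants (non-linear redundancy, model-based relevance) are not reproduced.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def mrmr_select(columns, target, k):
    """Greedy mRMR sketch: columns maps name -> list of values; returns up to k names."""
    # Relevance (assumed measure): |corr(feature, target)|, computed once.
    relevance = {name: abs(pearson(col, target)) for name, col in columns.items()}
    selected = []
    remaining = set(columns)
    while remaining and len(selected) < k:

        def score(name):
            if not selected:
                return relevance[name]  # first pick: relevance only
            # Redundancy (assumed measure): mean |corr| with already selected features.
            redundancy = sum(
                abs(pearson(columns[name], columns[s])) for s in selected
            ) / len(selected)
            return relevance[name] - redundancy  # difference form

        # sorted() makes tie-breaking deterministic.
        best = max(sorted(remaining), key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Synthetic data: x2 nearly duplicates x1, x3 carries independent signal.
rng = random.Random(0)
n = 500
a = [rng.gauss(0, 1) for _ in range(n)]
b = [rng.gauss(0, 1) for _ in range(n)]
columns = {
    "x1": a,
    "x2": [v + 0.1 * rng.gauss(0, 1) for v in a],
    "x3": b,
    "noise": [rng.gauss(0, 1) for _ in range(n)],
}
# Binary target driven by both underlying signals.
target = [
    1.0 if ai + 0.8 * bi + 0.3 * rng.gauss(0, 1) > 0 else 0.0
    for ai, bi in zip(a, b)
]

selected = mrmr_select(columns, target, k=3)
print(selected)
```

In this toy setup the redundancy penalty is what pushes the second pick toward the independent feature `x3` instead of the near-duplicate of the first pick, even though the near-duplicate has higher relevance on its own. A ranking by relevance alone would choose both correlated features first.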

Taxonomy

Topics: Machine Learning and Data Classification · Imbalanced Data Classification Techniques · Data Stream Mining Techniques

Methods: Feature Selection