Maximum Relevance and Minimum Redundancy Feature Selection Methods for a Marketing Machine Learning Platform
Zhenyu Zhao, Radhika Anand, Mallory Wang

TL;DR
This paper extends and evaluates mRMR feature selection methods for marketing machine learning at Uber, introducing non-linear redundancy and model-based relevance measures, leading to improved feature selection in large-scale classification tasks.
Contribution
The paper introduces novel non-linear redundancy and model-based relevance measures to enhance mRMR feature selection for marketing applications.
Findings
The extended mRMR methods outperform existing techniques in empirical tests.
The selected mRMR method improves model accuracy and efficiency.
The production implementation demonstrates practical benefits.
Abstract
In machine learning applications for online product offerings and marketing strategies, there are often hundreds or thousands of features available to build such models. Feature selection is an essential method in such applications, serving multiple objectives: improving prediction accuracy by eliminating irrelevant features, accelerating model training and prediction, reducing the monitoring and maintenance workload for the feature data pipeline, and providing better model interpretation and diagnosis capability. However, selecting an optimal feature subset from a large feature space is considered an NP-complete problem. The mRMR (Minimum Redundancy and Maximum Relevance) feature selection framework addresses this problem by selecting relevant features while controlling for redundancy within the selected set. This paper describes the approach to extend, evaluate, and…
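To make the relevance-versus-redundancy trade-off concrete, here is a minimal sketch of a generic greedy mRMR loop. It is not the paper's implementation: the function name `mrmr_select`, the use of absolute Pearson correlation for both relevance and redundancy, and the difference-based score are all assumptions chosen for brevity. The sketch also assumes no feature column is constant, since correlation is undefined in that case.

```python
import numpy as np

def mrmr_select(X, y, k):
    """Greedy mRMR sketch: pick up to k column indices of X.

    Assumed measures (illustrative only):
      relevance  = |Pearson correlation| between a feature and the target
      redundancy = mean |Pearson correlation| with already-selected features
      score      = relevance - redundancy
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n_features = X.shape[1]

    # Relevance of every feature to the target, computed once.
    relevance = np.array(
        [abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(n_features)]
    )
    # Pairwise absolute correlation between features, computed once.
    feature_corr = np.abs(np.corrcoef(X, rowvar=False))

    # Start with the single most relevant feature.
    selected = [int(np.argmax(relevance))]
    while len(selected) < min(k, n_features):
        remaining = [j for j in range(n_features) if j not in selected]
        # Average redundancy of each candidate against the selected set.
        redundancy = feature_corr[np.ix_(remaining, selected)].mean(axis=1)
        scores = relevance[remaining] - redundancy
        selected.append(remaining[int(np.argmax(scores))])
    return selected
```

On data with two nearly identical predictive features and one weaker but independent predictive feature, ranking by relevance alone would keep both duplicates, while this loop penalizes the second duplicate and tends to pick the independent feature instead. Swapping in other relevance or redundancy measures only requires changing how `relevance` and `feature_corr` are computed.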
Taxonomy
Topics: Machine Learning and Data Classification · Imbalanced Data Classification Techniques · Data Stream Mining Techniques
Methods: Feature Selection
