A feature construction framework based on outlier detection and discriminative pattern mining
Albrecht Zimmermann

TL;DR
This paper introduces a general framework for feature construction that uses outlier detection and discriminative pattern mining to enhance data representation, improving classifier performance especially with simple models.
Contribution
The author proposes a novel, general framework for feature construction based on outlier detection and pattern mining, evaluated across diverse datasets.
Findings
Derived features outperform existing methods like DC-Fringe
The approach reduces overfitting compared to other methods
Naive Bayes benefits significantly from the constructed features
Abstract
No matter the expressive power and sophistication of supervised learning algorithms, their effectiveness is restricted by the features describing the data. This is not a new insight in ML and many methods for feature selection, transformation, and construction have been developed. But while this is on-going for general techniques for feature selection and transformation, i.e. dimensionality reduction, work on feature construction, i.e. enriching the data, is by now mainly the domain of image, particularly character, recognition, and NLP. In this work, we propose a new general framework for feature construction. The need for feature construction in a data set is indicated by class outliers and discriminative pattern mining used to derive features on their k-neighborhoods. We instantiate the framework with LOF and C4.5-Rules, and evaluate the usefulness of the derived features on a…
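The pipeline the abstract describes (flag class outliers, take their k-neighborhoods, mine a discriminative pattern there, add it as a feature) can be sketched roughly as follows. This is an illustrative reading of the abstract, not the paper's implementation. LOF is written out in plain Python and computed within each class, which is one possible interpretation of "class outliers". A single threshold test stands in for the C4.5-Rules pattern miner. The function names, the toy data, the cutoff of 1.5, and the values of k are all assumptions made for the example.

```python
import math

def knn(points, i, k):
    """Indices of the k nearest neighbours of points[i], excluding i itself."""
    others = (j for j in range(len(points)) if j != i)
    return sorted(others, key=lambda j: math.dist(points[i], points[j]))[:k]

def lof_scores(points, k):
    """Local Outlier Factor per point (ties in k-distance ignored for brevity)."""
    n = len(points)
    nbrs = [knn(points, i, k) for i in range(n)]
    kdist = [math.dist(points[i], points[nbrs[i][-1]]) for i in range(n)]

    def lrd(i):
        # Local reachability density: inverse mean reachability distance.
        reach = [max(kdist[j], math.dist(points[i], points[j])) for j in nbrs[i]]
        return len(reach) / sum(reach)

    dens = [lrd(i) for i in range(n)]
    return [sum(dens[j] for j in nbrs[i]) / (k * dens[i]) for i in range(n)]

def class_outliers(X, y, k, cutoff):
    """Assumption: a class outlier has a high LOF among instances of its own class."""
    flagged = []
    for c in sorted(set(y)):
        members = [i for i, label in enumerate(y) if label == c]
        scores = lof_scores([X[i] for i in members], k)
        flagged += [i for i, s in zip(members, scores) if s > cutoff]
    return sorted(flagged)

def best_threshold_test(X, y, idx, target):
    """Stand-in for C4.5-Rules: the test x[a] <= t that best isolates `target` in idx."""
    best = None
    for a in range(len(X[0])):
        vals = sorted({X[i][a] for i in idx})
        for lo, hi in zip(vals, vals[1:]):
            t = (lo + hi) / 2
            hits = sum((X[i][a] <= t) == (y[i] == target) for i in idx)
            score = max(hits, len(idx) - hits)  # direction of the test is irrelevant
            if best is None or score > best[0]:
                best = (score, a, t)
    return None if best is None else (best[1], best[2])

# Toy data: a tight class-0 cluster, a class-1 square, and one stray class-0 point.
X = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (5.5, 3),
     (5, 5), (5, 6), (6, 5), (6, 6)]
y = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]

outliers = class_outliers(X, y, k=2, cutoff=1.5)
# Mine one test per outlier on its neighbourhood (the outlier plus its 3 nearest points).
tests = [best_threshold_test(X, y, [i] + knn(X, i, 3), y[i]) for i in outliers]
tests = [t for t in tests if t is not None]
# Append one binary feature per mined test to every instance.
X_new = [tuple(x) + tuple(int(x[a] <= t) for a, t in tests) for x in X]
```

On this toy data the stray class-0 point at index 5 is the only instance flagged, and the test mined on its neighborhood is a threshold of 4.0 on the second attribute. A rule learner such as C4.5-Rules would produce conjunctions of such tests, which this single-threshold stand-in cannot express.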
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Imbalanced Data Classification Techniques · Machine Learning and Data Classification
Methods: Support Vector Machine
