Self-optimizing Feature Generation via Categorical Hashing Representation and Hierarchical Reinforcement Crossing
Wangyang Ying, Dongjie Wang, Kunpeng Liu, Leilei Sun, Yanjie Fu

TL;DR
This paper introduces a novel framework for self-optimizing feature generation that combines categorical hashing and hierarchical reinforcement crossing to efficiently generate meaningful features.
Contribution
It proposes a generic representation-crossing framework with a three-step hashing approach and hierarchical reinforcement crossing for improved feature generation.
Findings
Demonstrates effectiveness in generating meaningful features.
Shows improved efficiency over existing methods.
Validates robustness across datasets.
Abstract
Feature generation aims to generate new and meaningful features to create a discriminative representation space. A generated feature is meaningful when the generated feature is from a feature pair with inherent feature interaction. In the real world, experienced data scientists can identify potentially useful feature-feature interactions, and generate meaningful dimensions from an exponentially large search space, in an optimal crossing form over an optimal generation path. But, machines have limited human-like abilities. We generalize such learning tasks as self-optimizing feature generation. Self-optimizing feature generation imposes several under-addressed challenges on existing systems: meaningful, robust, and efficient generation. To tackle these challenges, we propose a principled and generic representation-crossing framework to solve self-optimizing feature generation. To achieve…
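To make the abstract's notion of a "meaningful" generated feature concrete, here is a minimal Python sketch of crossing one feature pair. It is an illustration only, not the paper's hashing or reinforcement method: the toy data, the candidate operations, and the names `pearson`, `CROSS_OPS`, and `score_crossings` are all assumptions introduced here. The target is constructed to depend on the interaction of the two features, so the product crossing scores well while each raw feature and the sum crossing do not.

```python
import random
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


# Hypothetical candidate crossing operations for a feature pair.
CROSS_OPS = {
    "sum": lambda a, b: a + b,
    "product": lambda a, b: a * b,
}


def score_crossings(f1, f2, target):
    """Score each candidate crossing by |correlation| with the target."""
    scores = {}
    for name, op in CROSS_OPS.items():
        crossed = [op(a, b) for a, b in zip(f1, f2)]
        scores[name] = abs(pearson(crossed, target))
    return scores


# Toy data (assumption): the target depends only on the f1*f2 interaction.
rng = random.Random(0)
n = 2000
f1 = [rng.uniform(-1, 1) for _ in range(n)]
f2 = [rng.uniform(-1, 1) for _ in range(n)]
target = [a * b + rng.gauss(0, 0.05) for a, b in zip(f1, f2)]

raw_score = abs(pearson(f1, target))           # a raw feature alone
scores = score_crossings(f1, f2, target)       # crossings of the pair
best_op = max(scores, key=scores.get)
print(f"raw f1: {raw_score:.3f}, crossings: {scores}, best: {best_op}")
```

Exhaustively scoring every pair and operation like this grows quickly with the number of features, which is the large search space the abstract refers to.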
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Video Surveillance and Tracking Methods · Machine Learning and Data Classification
