A Survey on Data-Centric AI: Tabular Learning from Reinforcement Learning and Generative AI Perspective
Wangyang Ying, Cong Wei, Nanxu Gong, Xinyuan Wang, Haoyue Bai, Arun Vignesh Malarkkan, Sixun Dong, Dongjie Wang, Denghui Zhang, Yanjie Fu

TL;DR
This survey reviews recent reinforcement learning and generative AI techniques for improving data quality in tabular datasets, focusing on feature selection and generation to enhance model performance across various domains.
Contribution
It systematically analyzes recent advances in RL and generative methods for tabular data optimization, highlighting their roles in automating feature engineering.
Findings
RL and generative methods improve feature engineering automation.
Recent techniques show promising results in real-world applications.
Challenges include data quality and method scalability.
Abstract
Tabular data is one of the most widely used data formats across various domains such as bioinformatics, healthcare, and marketing. As artificial intelligence moves towards a data-centric perspective, improving data quality is essential for enhancing model performance in tabular data-driven applications. This survey focuses on data-driven tabular data optimization, specifically exploring reinforcement learning (RL) and generative approaches for feature selection and feature generation as fundamental techniques for refining data spaces. Feature selection aims to identify and retain the most informative attributes, while feature generation constructs new features to better capture complex data patterns. We systematically review existing generative methods for tabular data engineering, analyzing their latest advancements, real-world applications, and respective strengths and limitations.…
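To make the distinction between feature selection and feature generation concrete, here is a minimal sketch in Python. It is not one of the RL or generative methods the survey covers: it swaps in a plain correlation filter for selection and pairwise products for generation, purely as an illustration. The function names (`select_features`, `generate_features`), the synthetic table, and the target are all hypothetical and assumed for this example.

```python
import random
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric lists (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def select_features(table, target, k):
    """Feature selection stand-in: keep the k columns most correlated with the target."""
    scores = {name: abs(pearson(col, target)) for name, col in table.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


def generate_features(table):
    """Feature generation stand-in: build one product column per pair of original columns."""
    new = {}
    for a, b in combinations(table, 2):
        new[f"{a}*{b}"] = [x * y for x, y in zip(table[a], table[b])]
    return new


# Synthetic table (assumed for illustration): the target depends on an interaction
# between x1 and x2 that no single original column captures.
rng = random.Random(0)
n = 200
table = {
    "x1": [rng.uniform(-1, 1) for _ in range(n)],
    "x2": [rng.uniform(-1, 1) for _ in range(n)],
    "noise": [rng.uniform(-1, 1) for _ in range(n)],
}
target = [a * b for a, b in zip(table["x1"], table["x2"])]

# Generate candidate features first, then select from the enlarged feature space.
augmented = {**table, **generate_features(table)}
selected = select_features(augmented, target, k=1)
print(selected)
```

In this toy setup, generation adds the interaction column that selection then picks out. The methods reviewed in the survey replace both hand-written heuristics with learned search, by an RL agent or a generative model, over the space of feature subsets and transformations.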
Taxonomy
Topics: Data Stream Mining Techniques
Methods: Feature Selection
