Closing the gap on tabular data with Fourier and Implicit Categorical Features
Marius Dragoi, Florin Gogianu, Elena Burceanu

TL;DR
This paper introduces a novel approach combining statistical feature processing and Learned Fourier techniques to enhance deep learning performance on tabular data, bridging the gap with tree-based methods.
Contribution
It proposes a new preprocessing method and Fourier-based bias mitigation to improve neural network effectiveness on tabular datasets, outperforming traditional deep models.
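For readers unfamiliar with Fourier feature mappings, the general idea can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `fourier_features`, the frequency matrix `B`, and its shape are assumptions made here, and in a learned variant `B` would be a trainable parameter rather than a fixed random array.

```python
import numpy as np

def fourier_features(x, freqs):
    """Map x of shape (n, d) to [sin(2*pi*x@B), cos(2*pi*x@B)] of shape (n, 2*m).

    freqs (B) has shape (d, m). It is a fixed array here for illustration;
    a learned variant would update it by gradient descent (assumption).
    """
    proj = 2.0 * np.pi * (x @ freqs)
    return np.concatenate([np.sin(proj), np.cos(proj)], axis=1)

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 3))   # 5 rows, 3 numerical features
B = rng.normal(size=(3, 8))   # 8 hypothetical frequencies per input dimension
z = fourier_features(x, B)    # shape (5, 16), fed to the network instead of x
```

Higher-frequency entries in `B` let a downstream network represent less smooth functions of the inputs, which is the kind of bias mitigation the summary refers to.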
Findings
Deep models with proposed preprocessing outperform baseline neural networks.
Achieves performance comparable or superior to XGBoost on benchmark datasets.
Significantly reduces the performance gap between neural networks and tree-based methods.
Abstract
While Deep Learning has demonstrated impressive results in applications on various data types, it continues to lag behind tree-based methods when applied to tabular data, often referred to as the last "unconquered castle" for neural networks. We hypothesize that a significant advantage of tree-based methods lies in their intrinsic capability to model and exploit non-linear interactions induced by features with categorical characteristics. In contrast, neural-based methods exhibit biases toward uniform numerical processing of features and smooth solutions, making it challenging for them to effectively leverage such patterns. We address this performance gap by using statistical-based feature processing techniques to identify features that are strongly correlated with the target once discretized. We further mitigate the bias of deep models for overly-smooth solutions, a bias that does not…
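The idea of scoring a feature by its association with the target once discretized can be sketched as below. The excerpt does not say which statistical test the authors use, so this example substitutes a chi-square statistic on quantile bins; the function name `binned_chi2`, the bin count, and the synthetic data are all assumptions for illustration.

```python
import numpy as np

def binned_chi2(feature, target, n_bins=8):
    """Chi-square statistic between a quantile-binned feature and a class label."""
    # Quantile edges; np.unique drops duplicate edges for low-cardinality features.
    edges = np.unique(np.quantile(feature, np.linspace(0, 1, n_bins + 1)))
    n_rows = max(len(edges) - 1, 1)
    # Interior edges only, so bin ids run from 0 to n_rows - 1.
    bins = np.digitize(feature, edges[1:-1])
    classes, y = np.unique(target, return_inverse=True)
    # Contingency table of (bin, class) counts.
    table = np.zeros((n_rows, len(classes)))
    np.add.at(table, (bins, y), 1)
    expected = table.sum(1, keepdims=True) * table.sum(0, keepdims=True) / table.sum()
    mask = expected > 0
    return float(((table - expected)[mask] ** 2 / expected[mask]).sum())

rng = np.random.default_rng(0)
x_informative = rng.uniform(size=2000)
x_noise = rng.uniform(size=2000)
# The label flips between quarters of the range: a piecewise-constant dependency.
y = (np.floor(x_informative * 4) % 2).astype(int)

score_informative = binned_chi2(x_informative, y)
score_noise = binned_chi2(x_noise, y)
```

A feature with a high score after binning behaves like a categorical variable with respect to the target, even if it is stored as a number, and could then be routed to a categorical treatment such as an embedding.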
Peer Reviews
Decision·Submitted to ICLR 2024
The question the paper addresses is highly relevant, and the methods that the paper proposes seem very interesting to investigate in this context. The paper does a thorough benchmark using an established protocol and shows improvements over XGBoost. The authors summarize the current literature on the topic well.
While the authors summarize the current literature well, the comparison between the literature and the proposed method is somewhat lacking. While it is not feasible to reimplement and evaluate all the competing methods, at least some of them should be compared. In particular, Kadra showed that simple networks can perform well if tuned correctly, while this paper uses only a very limited search space for the baseline MLP. The authors use a measure that I'm unfamiliar with for evaluation,
- The work is introduced very well: the motivation is clear and the existing work is very nicely summarized. - The authors show impressive results with the proposed method, with improved accuracy / R2 score on average across many datasets.
- None of the methods are defined or explained in enough detail to understand what is being done and reproduce the results. Specifically, the particular model used and how it is applied, the Fourier features, and the implicit categorical feature selection are not described clearly enough to reproduce what was done. Furthermore, what is described does not make much sense as currently written. As one example, it's stated at the beginning of modeling, a 1D
- The paper is well presented. - Evaluations are intensive/solid with good analysis.
- There is limited technical contribution in the paper. - The method here is more like feature engineering work that requires a lot of tuning to perform well. - The performance improvement is marginal compared to XGBoost.
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Explainable Artificial Intelligence (XAI) · Model Reduction and Neural Networks
