InterpreTabNet: Distilling Predictive Signals from Tabular Data by Salient Feature Interpretation
Jacob Si, Wendy Yusi Cheng, Michael Cooper, Rahul G. Krishnan

TL;DR
InterpreTabNet makes neural networks for tabular data more interpretable by modeling the attention mask as a latent variable and regularizing it toward sparsity, which yields clearer feature-importance explanations while keeping predictive accuracy competitive.
Contribution
This paper introduces InterpreTabNet, a novel variant of TabNet that uses Gumbel-Softmax sampling and KL regularization to produce sparse, interpretable attention masks.
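To make the sampling step concrete, here is a minimal sketch of drawing a relaxed feature mask from a Gumbel-Softmax distribution. It is not the authors' implementation: the function name `gumbel_softmax_sample`, the `temperature` value, and the example logits are all assumptions chosen for illustration.

```python
import math
import random

def gumbel_softmax_sample(logits, temperature=0.5, rng=random):
    """Draw one relaxed one-hot sample over features from unnormalized logits.

    Hypothetical helper for illustration; not taken from the paper's code.
    """
    noisy = []
    for logit in logits:
        # Keep u strictly inside (0, 1) so both logarithms are defined.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        # Gumbel(0, 1) noise via inverse transform sampling.
        g = -math.log(-math.log(u))
        noisy.append((logit + g) / temperature)
    # Numerically stable softmax over the perturbed, temperature-scaled logits.
    m = max(noisy)
    exps = [math.exp(x - m) for x in noisy]
    total = sum(exps)
    return [e / total for e in exps]

# Example: a mask over four features (logits are made-up values).
rng = random.Random(0)
mask = gumbel_softmax_sample([2.0, 0.1, -1.0, 0.5], temperature=0.5, rng=rng)
```

Lower temperatures push the sample closer to a one-hot vector, which is one reason this relaxation is a natural fit for sparse attention masks; higher temperatures give smoother, denser masks.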
Findings
Outperforms previous interpretability methods on real datasets
Achieves higher feature sparsity and clarity in explanations
Maintains competitive predictive accuracy
Abstract
Tabular data are omnipresent across many sectors of industry. Neural networks for tabular data, such as TabNet, have been proposed to make predictions while leveraging the attention mechanism for interpretability. However, the inferred attention masks are often dense, making it challenging to come up with rationales about the predictive signal. To remedy this, we propose InterpreTabNet, a variant of the TabNet model that models the attention mechanism as a latent variable sampled from a Gumbel-Softmax distribution. This enables us to regularize the model to learn distinct concepts in the attention masks via a KL divergence regularizer. The regularizer prevents overlapping feature selection by promoting sparsity, which maximizes the model's efficacy and improves interpretability when determining which features are important for predicting the outcome. To assist in the interpretation of feature interdependencies…
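As an illustration of how a KL-based term can discourage overlapping feature selection, the sketch below scores a set of attention masks by their mean pairwise KL divergence. This is an assumption about one possible form of such a regularizer, not the objective defined in the paper; `kl_divergence`, `overlap_penalty`, and the example masks are hypothetical.

```python
import math

def kl_divergence(p, q, eps=1e-8):
    """KL(p || q) for two discrete distributions; eps guards against log(0)."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def overlap_penalty(masks):
    """Negative mean pairwise KL between masks (hypothetical regularizer).

    The value is lower when masks attend to different features, so adding it
    to a loss would reward distinct, non-overlapping masks.
    """
    pairs = [(i, j) for i in range(len(masks))
             for j in range(len(masks)) if i != j]
    mean_kl = sum(kl_divergence(masks[i], masks[j]) for i, j in pairs) / len(pairs)
    return -mean_kl

# Two masks over three features (made-up values).
overlapping = [[0.5, 0.5, 0.0], [0.5, 0.5, 0.0]]  # both steps pick the same features
distinct = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]     # each step picks its own feature

penalty_overlapping = overlap_penalty(overlapping)
penalty_distinct = overlap_penalty(distinct)
```

Under this sketch, identical masks receive a penalty of about zero, while masks that concentrate on different features receive a strongly negative one, so minimizing the term favors the second case.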
Taxonomy
Topics: Neural Networks and Applications · Image Processing and 3D Reconstruction
Methods: Gated Linear Unit · Batch Normalization · Residual Connection · Dense Connections · TabNet · Feature Selection
