A Gap Between Decision Trees and Neural Networks
Akash Kumar

TL;DR
This paper investigates the geometric complexity of decision boundaries in neural networks versus decision trees, showing that simple decision regions can be difficult for shallow ReLU networks to approximate accurately.
Contribution
It uses the Radon total variation (RTV) seminorm to analyze the geometric complexity of the level sets of shallow ReLU networks, and demonstrates an inherent complexity gap between these networks and decision trees.
Findings
Decision tree indicator functions have infinite RTV.
Common smooth surrogates also have infinite RTV in higher dimensions.
A constructed smooth score function with finite RTV can exactly recover the decision sets.
Abstract
We study when geometric simplicity of decision boundaries, used here as a notion of interpretability, can conflict with accurate approximation of axis-aligned decision trees by shallow neural networks. Decision trees induce rule-based, axis-aligned decision regions (finite unions of boxes), whereas shallow ReLU networks are typically trained as score models whose predictions are obtained by thresholding. We analyze the infinite-width, bounded-norm, single-hidden-layer ReLU class through the Radon total variation (RTV) seminorm, which controls the geometric complexity of level sets. We first show that the hard tree indicator has infinite RTV. Moreover, two natural split-wise continuous surrogates--piecewise-linear ramp smoothing and sigmoidal (logistic) smoothing--also have infinite RTV in higher dimensions, while…
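To make the objects in the abstract concrete, here is a minimal Python sketch of a hard indicator for a single axis-aligned box, together with ramp and sigmoid smoothings of each split. The function names, the `width` and `scale` parameters, and the choice to combine per-split steps by taking their product are illustrative assumptions, not the paper's construction. The sketch only evaluates these functions and does not compute RTV.

```python
import math

def box_indicator(x, lower, upper):
    """Hard indicator of an axis-aligned box: 1.0 inside, 0.0 outside."""
    inside = all(lo <= xi <= hi for xi, lo, hi in zip(x, lower, upper))
    return 1.0 if inside else 0.0

def ramp(t, width):
    """Piecewise-linear step: 0 below -width/2, 1 above width/2, linear between."""
    return min(1.0, max(0.0, t / width + 0.5))

def sigmoid(t, scale):
    """Logistic step; a smaller scale gives a sharper transition."""
    return 1.0 / (1.0 + math.exp(-t / scale))

def smoothed_box(x, lower, upper, step):
    """Assumed split-wise surrogate: product of one smoothed step per split."""
    score = 1.0
    for xi, lo, hi in zip(x, lower, upper):
        score *= step(xi - lo) * step(hi - xi)
    return score

if __name__ == "__main__":
    lower, upper = (0.0, 0.0), (1.0, 1.0)
    for point in [(0.5, 0.5), (1.0, 0.5), (1.5, 0.5)]:
        hard = box_indicator(point, lower, upper)
        soft_ramp = smoothed_box(point, lower, upper, lambda t: ramp(t, 0.1))
        soft_sig = smoothed_box(point, lower, upper, lambda t: sigmoid(t, 0.05))
        print(point, hard, round(soft_ramp, 4), round(soft_sig, 4))
```

A prediction would then be obtained by thresholding the score, for example at 0.5. Under this product form, the thresholded set of a smoothed score need not coincide with the original box near its corners.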
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning · Generative Adversarial Networks and Image Synthesis
