Modern Neural Networks for Small Tabular Datasets: The New Default for Field-Scale Digital Soil Mapping?
Viacheslav Barkov, Jonas Schmidinger, Robin Gebbers, Martin Atzmueller

TL;DR
This study benchmarks modern neural network architectures for soil property prediction in field-scale digital soil mapping, showing they outperform classical methods and recommending TabPFN as the new default tool.
Contribution
The paper provides a comprehensive evaluation of recent ANN models for small tabular datasets in pedometrics and shows that they outperform traditional algorithms on most of the datasets tested.
Findings
Modern ANNs outperform classical methods on most datasets.
TabPFN achieves the best overall performance and robustness.
Deep learning is now a viable default for field-scale soil property prediction.
Abstract
In the field of pedometrics, tabular machine learning is the predominant method for soil property prediction from remote and proximal soil sensing data, forming a central component of Digital Soil Mapping (DSM). At the field scale, this predictive soil modeling (PSM) task is typically constrained by small training sample sizes and high feature-to-sample ratios in soil spectroscopy. Traditionally, these conditions have proven challenging for conventional deep learning methods. Classical machine learning algorithms, particularly tree-based models like Random Forest and linear models such as Partial Least Squares Regression, have long been the default choice for pedometric modeling within DSM. Recent advances in artificial neural networks (ANN) for tabular data challenge this view, yet their suitability for field-scale DSM has not been proven. We introduce a comprehensive benchmark that…
