SmallML: Bayesian Transfer Learning for Small-Data Predictive Analytics
Semen Leontev

TL;DR
SmallML is a Bayesian transfer learning framework that enables accurate predictive analytics on small datasets typical of SMEs, combining transfer learning, hierarchical Bayesian modeling, and conformal prediction for uncertainty quantification.
Contribution
The paper introduces SmallML, a novel three-layer Bayesian transfer learning architecture specifically designed for small-data SME applications, achieving high accuracy and reliable uncertainty estimates.
Findings
Achieves 96.7% AUC with 100 observations per business.
Improves prediction accuracy by 24.2 points over independent logistic regression.
Provides finite-sample coverage guarantees with conformal prediction.
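To make the coverage claim concrete, here is a minimal sketch of generic split conformal prediction for a binary classifier. It is not taken from the paper: the synthetic data, the nonconformity score (1 minus the predicted probability of the true class), and the helper names `conformal_quantile` and `prediction_sets` are all assumptions chosen for illustration.

```python
import numpy as np

def conformal_quantile(cal_scores, alpha):
    """Finite-sample-corrected (1 - alpha) quantile of the calibration scores."""
    n = len(cal_scores)
    k = int(np.ceil((n + 1) * (1 - alpha)))  # rank of the order statistic to use
    if k > n:
        return np.inf  # too few calibration points: every label must be included
    return np.sort(cal_scores)[k - 1]

def prediction_sets(probs, qhat):
    """Boolean mask (n_samples, n_classes): a class is in the set if 1 - p <= qhat."""
    return (1.0 - probs) <= qhat

# Synthetic stand-in for a fitted classifier's probabilities (assumption, not paper data).
rng = np.random.default_rng(0)
n_cal, n_test, alpha = 200, 2000, 0.1

def simulate(n):
    y = rng.integers(0, 2, size=n)
    logits = 2.0 * (2 * y - 1) + rng.normal(0.0, 2.0, size=n)  # noisy score for class 1
    p1 = 1.0 / (1.0 + np.exp(-logits))
    return np.column_stack([1.0 - p1, p1]), y

cal_probs, cal_y = simulate(n_cal)
test_probs, test_y = simulate(n_test)

# Nonconformity score on held-out calibration data: 1 - probability of the true class.
cal_scores = 1.0 - cal_probs[np.arange(n_cal), cal_y]
qhat = conformal_quantile(cal_scores, alpha)

sets = prediction_sets(test_probs, qhat)
coverage = sets[np.arange(n_test), test_y].mean()
print(f"target coverage: {1 - alpha:.2f}, empirical coverage: {coverage:.3f}")
```

The `(n + 1)` correction in the quantile is what yields the finite-sample guarantee: under exchangeability, the true label falls in the set with probability at least 1 - alpha on average over calibration draws, regardless of how good the underlying classifier is.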
Abstract
Small and medium-sized enterprises (SMEs) represent 99.9% of U.S. businesses yet remain systematically excluded from AI due to a mismatch between their operational scale and modern machine learning's data requirements. This paper introduces SmallML, a Bayesian transfer learning framework achieving enterprise-level prediction accuracy with datasets as small as 50-200 observations. We develop a three-layer architecture integrating transfer learning, hierarchical Bayesian modeling, and conformal prediction. Layer 1 extracts informative priors from 22,673 public records using a SHAP-based procedure transferring knowledge from gradient boosting to logistic regression. Layer 2 implements hierarchical pooling across J=5-50 SMEs with adaptive shrinkage, balancing population patterns with entity-specific characteristics. Layer 3 provides conformal sets with finite-sample coverage guarantees…
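The adaptive shrinkage described for Layer 2 can be illustrated with a much simpler Gaussian analogue. The sketch below is not the paper's hierarchical logistic model: it assumes a normal-normal model with known within-business variance `sigma2` and between-business variance `tau2`, and every name and number in it is hypothetical.

```python
import numpy as np

def shrinkage_weight(n_j, sigma2, tau2):
    """Weight on a business's own mean; approaches 1 as its sample size n_j grows."""
    return tau2 / (tau2 + sigma2 / n_j)

def partial_pool(ybar, n, mu, sigma2, tau2):
    """Posterior mean per business: a blend of its own mean and the population mean mu."""
    w = shrinkage_weight(np.asarray(n, dtype=float), sigma2, tau2)
    return w * np.asarray(ybar, dtype=float) + (1.0 - w) * mu

# Two hypothetical businesses with the same observed rate but different sample sizes.
ybar = [0.30, 0.30]      # observed per-business means (illustrative values)
n = [50, 200]            # observations per business, within the 50-200 range cited above
mu = 0.10                # assumed population-level mean
sigma2, tau2 = 0.09, 0.0025  # assumed within- and between-business variances

pooled = partial_pool(ybar, n, mu, sigma2, tau2)
print(pooled)
```

The business with 50 observations is pulled further toward the population mean than the one with 200, which is the sense in which the shrinkage adapts to how much data each entity contributes.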
Taxonomy
Topics: Customer churn and segmentation · Imbalanced Data Classification Techniques · Data Quality and Management
