GP-Tree: A Gaussian Process Classifier for Few-Shot Incremental Learning
Idan Achituve, Aviv Navon, Yochai Yemini, Gal Chechik, Ethan Fetaya

TL;DR
GP-Tree introduces a hierarchical Gaussian process model combined with deep kernel learning, enabling scalable and accurate multi-class classification in few-shot incremental learning scenarios.
Contribution
The paper presents a novel tree-based hierarchical GP model that scales with both data size and the number of classes, improving incremental few-shot learning performance.
Findings
Outperforms existing GP training baselines.
Achieves higher accuracy on incremental few-shot benchmarks.
Scales well with data size and number of classes.
Abstract
Gaussian processes (GPs) are non-parametric, flexible models that work well in many tasks. Combining GPs with deep learning methods via deep kernel learning (DKL) is especially compelling due to the strong representational power induced by the network. However, inference in GPs, whether with or without DKL, can be computationally challenging on large datasets. Here, we propose GP-Tree, a novel method for multi-class classification with Gaussian processes and DKL. We develop a tree-based hierarchical model in which each internal node of the tree fits a GP to the data using the Pólya-Gamma augmentation scheme. As a result, our method scales well with both the number of classes and data size. We demonstrate the effectiveness of our method against other Gaussian process training baselines, and we show how our general GP approach achieves improved accuracy on standard incremental few-shot…
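To make the tree-based idea concrete, the following is a minimal sketch of how a hierarchy of binary decisions can yield multi-class probabilities: each internal node makes a left/right decision, and a class's probability is the product of the decisions along the path to its leaf. It is an illustration under stated assumptions, not the authors' implementation. In particular, the `Node` class, the `class_probs` function, the toy tree, and the deterministic `logit_fn` callables are all hypothetical; in GP-Tree each node's decision would come from a GP fitted with Pólya-Gamma augmentation rather than a fixed function.

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Sequence


@dataclass
class Node:
    """A node in a binary class tree (hypothetical structure for illustration)."""
    label: Optional[int] = None          # set only on leaves
    left: Optional["Node"] = None        # child taken with probability 1 - p
    right: Optional["Node"] = None       # child taken with probability p
    # Stand-in for the node's binary classifier. In GP-Tree this would be
    # a GP latent function; here it is a plain deterministic callable.
    logit_fn: Optional[Callable[[Sequence[float]], float]] = None


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def class_probs(node: Node, x: Sequence[float], prob: float = 1.0,
                out: Optional[Dict[int, float]] = None) -> Dict[int, float]:
    """Return {class label: probability} by multiplying decisions along each path."""
    if out is None:
        out = {}
    if node.label is not None:           # leaf: record the accumulated path probability
        out[node.label] = prob
        return out
    p_right = sigmoid(node.logit_fn(x))  # binary decision at this internal node
    class_probs(node.left, x, prob * (1.0 - p_right), out)
    class_probs(node.right, x, prob * p_right, out)
    return out


# Toy tree over three classes: the root separates class 0 from {1, 2},
# and a second internal node separates class 1 from class 2.
tree = Node(
    logit_fn=lambda x: x[0],
    left=Node(label=0),
    right=Node(
        logit_fn=lambda x: x[1],
        left=Node(label=1),
        right=Node(label=2),
    ),
)

probs = class_probs(tree, [0.0, 1.0])
print(probs)
```

Because every internal node splits its incoming probability mass between its two children, the leaf probabilities sum to one by construction. A tree over C classes needs only C - 1 binary classifiers, which is one intuition for why a hierarchical model of this kind can scale with the number of classes.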
Taxonomy
Topics: Gaussian Processes and Bayesian Inference · Air Quality Monitoring and Forecasting · Domain Adaptation and Few-Shot Learning
Methods: Deep Kernel Learning · Greedy Policy Search · Gaussian Process · Data augmentation using Pólya-Gamma latent variables
