Model Selection for Support Vector Machine Classification

Carl Gold; Peter Sollich

arXiv:cond-mat/0203334·cond-mat.dis-nn·May 23, 2007·Neurocomputing

TL;DR

This paper studies probabilistic model selection for SVM classification. It derives evidence-based criteria and a gradient ascent method for the evidence, then compares them with traditional error estimates in extensive experiments.

Contribution

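The paper extends a recently developed probabilistic framework for SVMs to quadratic slack penalties, derives a simple approximation to the evidence together with the exact gradients of the evidence, and describes a gradient ascent algorithm that it compares with existing methods.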

Findings

01. Evidence-based criteria have fewer local optima than simpler methods.

02. Exact evidence-based model selection yields more consistent and lower test errors.

03. Gradient ascent on evidence outperforms traditional error estimates in stability and accuracy.

Abstract

We address the problem of model selection for Support Vector Machine (SVM) classification. For fixed functional form of the kernel, model selection amounts to tuning kernel parameters and the slack penalty coefficient $C$. We begin by reviewing a recently developed probabilistic framework for SVM classification. An extension to the case of SVMs with quadratic slack penalties is given and a simple approximation for the evidence is derived, which can be used as a criterion for model selection. We also derive the exact gradients of the evidence in terms of posterior averages and describe how they can be estimated numerically using Hybrid Monte Carlo techniques. Though computationally demanding, the resulting gradient ascent algorithm is a useful baseline tool for probabilistic SVM model selection, since it can locate maxima of the exact (unapproximated) evidence. We then perform extensive…
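The gradient ascent idea in the abstract can be illustrated with a minimal sketch of the outer loop only. Everything in it is an assumption made for illustration: `toy_criterion` is a hypothetical concave stand-in and not the SVM evidence, the two hyperparameters are taken to be $C$ and a single kernel width, and central finite differences replace the exact gradients that the paper estimates with Hybrid Monte Carlo. The search runs in log space so that both hyperparameters stay positive.

```python
import math


def toy_criterion(log_c, log_width):
    # Hypothetical stand-in for a model selection criterion; NOT the SVM evidence.
    # It is concave in log space with its maximum at C = 10, width = 0.5.
    return -(log_c - math.log(10.0)) ** 2 - 2.0 * (log_width - math.log(0.5)) ** 2


def finite_difference_gradient(criterion, params, eps=1e-5):
    # Central finite differences; an assumption standing in for the exact
    # gradients of the evidence described in the abstract.
    grad = []
    for i in range(len(params)):
        up = list(params)
        down = list(params)
        up[i] += eps
        down[i] -= eps
        grad.append((criterion(*up) - criterion(*down)) / (2.0 * eps))
    return grad


def gradient_ascent(criterion, start, step=0.1, n_steps=500):
    # Plain fixed-step gradient ascent on the log-hyperparameters.
    params = list(start)
    for _ in range(n_steps):
        grad = finite_difference_gradient(criterion, params)
        params = [p + step * g for p, g in zip(params, grad)]
    return params


# Start from C = 1, width = 1 (log values of zero) and climb the criterion.
log_c, log_width = gradient_ascent(toy_criterion, start=[0.0, 0.0])
best_c, best_width = math.exp(log_c), math.exp(log_width)
print(best_c, best_width)
```

With a real criterion, each evaluation would involve training an SVM (and, for the exact evidence, sampling from the posterior), so the number of steps and the step size would need far more care than in this toy setting.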
