# Nonparametric Compositional Stochastic Optimization for Risk-Sensitive Kernel Learning

**Authors:** Amrit Singh Bedi, Alec Koppel, Ketan Rajawat, and Panchajanya Sanyal

arXiv: 1902.06011 · 2020-11-30

## TL;DR

This paper introduces COLK, a memory-efficient stochastic algorithm for risk-sensitive kernel learning in compositional optimization problems within RKHS, demonstrating convergence and robustness in supervised learning tasks.

## Contribution

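The paper presents the first memory-efficient stochastic method for compositional problems in an RKHS. It proves almost sure convergence with attenuating step-sizes and linear convergence in mean to a neighborhood with constant step-sizes, and shows that the complexity of the learned function is at worst finite.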

## Key findings

- In experiments with robust formulations of supervised learning, COLK reliably converges.
- It attains consistent performance across training runs.
- As a consequence, it overcomes overfitting in these supervised learning tasks.

## Abstract

In this work, we address optimization problems where the objective function is a nonlinear function of an expected value, i.e., compositional stochastic strongly convex programs. We consider the case where the decision variable is not vector-valued but instead belongs to a Reproducing Kernel Hilbert Space (RKHS), motivated by risk-aware formulations of supervised learning and Markov Decision Processes defined over continuous spaces. We develop the first memory-efficient stochastic algorithm for this setting, which we call Compositional Online Learning with Kernels (COLK). COLK, at its core a two-time-scale stochastic approximation method, addresses the fact that (i) compositions of expected value problems cannot be addressed by classical stochastic gradient methods due to the presence of the inner expectation; and (ii) the RKHS-induced parameterization has complexity proportional to the iteration index, which is mitigated through greedily constructed subspace projections. We establish almost sure convergence of COLK with attenuating step-sizes, and linear convergence in mean to a neighborhood with constant step-sizes, as well as the fact that its complexity is at worst finite. The experiments with robust formulations of supervised learning demonstrate that COLK reliably converges, attains consistent performance across training runs, and thus overcomes overfitting.
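The two ingredients named in the abstract, a two-time-scale update and a bounded kernel dictionary, can be illustrated with a minimal Python sketch. It is not the authors' implementation. Everything in it is an assumption made for illustration: the function names, the step sizes `alpha` and `beta`, the mean-plus-variance objective `E[r^2] + eta * Var(r)` with residual `r = f(x) - y`, and the Gaussian kernel. In particular, the pruning rule (drop the smallest-magnitude weight once a fixed `budget` is exceeded) is a crude stand-in for the greedily constructed subspace projections, not the same technique.

```python
import math
import random


def gaussian_kernel(a, b, bandwidth=0.2):
    """Gaussian kernel on scalars (illustrative choice of kernel)."""
    return math.exp(-((a - b) ** 2) / (2.0 * bandwidth ** 2))


def predict(centers, weights, x):
    """Evaluate the kernel expansion f(x) = sum_i w_i * k(c_i, x)."""
    return sum(w * gaussian_kernel(c, x) for c, w in zip(centers, weights))


def compositional_kernel_sgd(samples, alpha=0.1, beta=0.3, eta=0.5,
                             lam=1e-3, budget=30):
    """Hypothetical two-time-scale sketch for the assumed objective
    g(E[r], E[r^2]) + (lam / 2) * ||f||^2, with g(u1, u2) = (1 + eta) * u2 - eta * u1^2.
    """
    centers, weights = [], []
    u1, u2 = 0.0, 0.0  # running estimates of the inner expectations E[r], E[r^2]
    for x, y in samples:
        r = predict(centers, weights, x) - y
        # Faster time scale: track the inner expectation by averaging.
        u1 = (1.0 - beta) * u1 + beta * r
        u2 = (1.0 - beta) * u2 + beta * r * r  # kept only to report the risk estimate
        # Slower time scale: chain rule, with the outer gradient evaluated at
        # the tracked estimate u1 rather than at a single noisy sample.
        coef = -2.0 * eta * u1 + 2.0 * (1.0 + eta) * r
        weights = [(1.0 - alpha * lam) * w for w in weights]  # regularizer shrinkage
        centers.append(x)
        weights.append(-alpha * coef)
        # Stand-in for subspace projection (assumption): enforce a fixed budget
        # by discarding the term with the smallest absolute weight.
        if len(centers) > budget:
            j = min(range(len(weights)), key=lambda i: abs(weights[i]))
            del centers[j]
            del weights[j]
    return centers, weights, (u1, u2)


if __name__ == "__main__":
    random.seed(0)
    data = []
    for _ in range(500):
        x = random.random()
        data.append((x, math.sin(2.0 * math.pi * x) + random.gauss(0.0, 0.1)))
    centers, weights, (u1, u2) = compositional_kernel_sgd(data)
    print("dictionary size:", len(centers))
    print("tracked risk estimate:", (1.0 + 0.5) * u2 - 0.5 * u1 ** 2)
```

Without the auxiliary variable `u1`, a single sample would give a biased estimate of the gradient of the `-eta * E[r]^2` term, which is the difficulty with the inner expectation that point (i) of the abstract describes. Choosing `beta` larger than `alpha` here is only an illustrative way to let the inner estimate move faster than the function update.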

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1902.06011/full.md

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/1902.06011/full.md

## References

62 references — full list in the complete paper: https://tomesphere.com/paper/1902.06011/full.md

---
Source: https://tomesphere.com/paper/1902.06011