Self-Supervised Contextual Bandits in Computer Vision

Aniket Anand Deshmukh; Abhimanu Kumar; Levi Boyles; Denis Charles; Eren Manavoglu; Urun Dogan

arXiv:2003.08485 · cs.CV · March 20, 2020 · 1 citation

Open Access

TL;DR

This paper introduces a method that combines self-supervised learning with contextual bandits to improve reward optimization in computer vision tasks, reporting substantial gains in cumulative reward on eight datasets.

Contribution

It proposes a new approach that integrates self-supervision into contextual bandit algorithms, addressing the lack of implicit labels in early learning stages.

Findings

01. Substantial improvements in cumulative reward on eight datasets
02. Identification of cases where the method underperforms, along with alternative solutions
03. Enhanced data representation learning for better decision-making

Abstract

Contextual bandits are a common problem faced by machine learning practitioners in domains ranging from hypothesis testing to product recommendations. Many approaches have tried to exploit rich data representations for contextual bandit problems, with varying degrees of success. Self-supervised learning is a promising approach for finding rich data representations without explicit labels. In a typical self-supervised learning scheme, the primary task is defined by the problem objective (e.g., clustering, classification, or embedding generation) and the secondary task is defined by the self-supervision objective (e.g., rotation prediction, words in neighborhood, or colorization). In the usual self-supervision setting, we learn implicit labels from the training data for the secondary task. However, in the contextual bandit setting, we don't have the advantage of getting implicit labels…
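As a rough illustration of the two ingredients the abstract names, the sketch below generates implicit labels for a rotation-prediction secondary task and simulates bandit feedback, where a reward is observed only for the chosen action. It is not the authors' method: the function names, the epsilon-greedy policy, and the toy 2x2 "image" are all assumptions made for this example.

```python
import random

def rotate90(image):
    """Rotate a 2-D grid (list of rows) counterclockwise by 90 degrees."""
    return [list(row) for row in zip(*image)][::-1]

def rotation_pretext_examples(image):
    """Return [(rotated_image, k)] for k = 0..3 quarter turns.

    The label k comes from the transformation itself, so no human
    annotation is needed: it is an implicit label for the secondary task.
    """
    examples, current = [], [list(row) for row in image]
    for k in range(4):
        examples.append((current, k))
        current = rotate90(current)
    return examples

def bandit_reward(true_class, action):
    """Bandit feedback: the learner sees a reward only for the action it chose."""
    return 1.0 if action == true_class else 0.0

def epsilon_greedy(scores, epsilon, rng):
    """Pick a random action with probability epsilon, else the best-scoring one."""
    if rng.random() < epsilon:
        return rng.randrange(len(scores))
    return max(range(len(scores)), key=lambda a: scores[a])

# Hypothetical usage: a toy image and made-up per-action scores.
rng = random.Random(0)
image = [[1, 2], [3, 4]]
for rotated, k in rotation_pretext_examples(image):
    print(k, rotated)

action = epsilon_greedy([0.1, 0.9, 0.3], epsilon=0.1, rng=rng)
print(action, bandit_reward(true_class=1, action=action))
```

In a full system, the rotation labels would presumably drive an auxiliary loss on a shared representation while the bandit reward drives the primary objective; how the paper combines the two is not specified in the text above.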

Taxonomy

Topics: Advanced Bandit Algorithms Research · Machine Learning and Data Classification · Domain Adaptation and Few-Shot Learning