Cost-Aware Learning for Improved Identifiability with Multiple Experiments

Longyun Guo; Jean Honorio; John Morgan

arXiv:1802.04350·cs.LG·July 16, 2019

TL;DR

This paper studies learning from multiple experiments under a total sampling budget. It shows that using multiple experiments improves identifiability and that the gap between training and generalization error is $O(C^{-1/2})$ in the total cost $C$.

Contribution

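It introduces a framework for analyzing the sample complexity of learning from multiple experiments in which each sample has an experiment-dependent cost. Within this framework it shows that multiple experiments improve identifiability and, using Rademacher complexity, bounds the gap between training and generalization error.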

Findings

01

Learning from multiple experiments improves identifiability.

02

The generalization gap scales as O(C^{-1/2}) with total cost C.

03

Examples are provided for linear prediction, two-layer neural networks, and kernel methods.

Abstract

We analyze the sample complexity of learning from multiple experiments where the experimenter has a total budget for obtaining samples. In this problem, the learner should choose a hypothesis that performs well with respect to multiple experiments and their related data distributions. Each collected sample is associated with a cost that depends on the particular experiment. In our setup, a learner performs $m$ experiments while incurring a total cost $C$. We first show that learning from multiple experiments improves identifiability. Additionally, by using a Rademacher complexity approach, we show that the gap between the training and generalization error is $O(C^{-1/2})$. We also provide some examples for linear prediction, two-layer neural networks, and kernel methods.
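To give a feel for the $O(C^{-1/2})$ rate, the following minimal Python sketch evaluates an envelope of the form `constant / sqrt(C)` for a few budgets. It illustrates the scaling only and is not the paper's bound: the function name `gap_envelope` and the parameter `constant` are hypothetical, and the constant hidden by the big-O notation (which would depend on the hypothesis class and the per-experiment costs) is not specified here.

```python
import math


def gap_envelope(total_cost: float, constant: float = 1.0) -> float:
    """Return constant / sqrt(total_cost), an illustrative O(C^{-1/2}) envelope.

    `constant` is a hypothetical placeholder for the problem-dependent factor
    hidden by the big-O notation; it is an assumption, not a value from the paper.
    """
    if total_cost <= 0:
        raise ValueError("total_cost must be positive")
    return constant / math.sqrt(total_cost)


# Quadrupling the budget halves the envelope.
budgets = [100, 400, 1600]
for budget in budgets:
    print(f"C = {budget:5d}  envelope = {gap_envelope(budget):.4f}")
```

Under this rate, halving the gap requires roughly four times the total budget, whatever the constant is.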

Taxonomy

Topics: Machine Learning and Algorithms · Machine Learning and Data Classification · Fault Detection and Control Systems