Combining Prediction Intervals on Multi-Source Non-Disclosed Regression Datasets

Ola Spjuth; Robin Carrión Brännström; Lars Carlsson; Niharika Gauraha

arXiv:1908.05571·stat.ML·August 16, 2019·1 cites

Open Access

TL;DR

This paper introduces Non-Disclosed Conformal Prediction (NDCP), a method for combining prediction intervals from multiple non-disclosable data sources in regression tasks, ensuring validity and improving efficiency over single-source predictions.

Contribution

The paper proposes NDCP, a novel approach for aggregating conformal prediction intervals from separate data sources without data pooling, enhancing prediction efficiency.

Findings

01 NDCP produces conservatively valid prediction intervals.

02 Efficiency improves compared to single-source predictions.

03 Method performs well across varying data source sizes.

Abstract

Conformal Prediction is a framework that produces prediction intervals based on the output from a machine learning algorithm. In this paper we explore the case when training data is made up of multiple parts available in different sources that cannot be pooled. Here we consider the regression case and propose a method where a conformal predictor is trained on each data source independently, and where the prediction intervals are then combined into a single interval. We call the approach Non-Disclosed Conformal Prediction (NDCP), and we evaluate it on a regression dataset from the UCI machine learning repository using support vector regression as the underlying machine learning algorithm, with a varying number of data sources and sizes. The results show that the proposed method produces conservatively valid prediction intervals, and while we cannot retain the same efficiency as when all…
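The workflow described in the abstract (one conformal predictor per source, intervals combined afterwards) can be sketched roughly as follows. This is an illustrative sketch and not the authors' implementation. Several pieces are assumptions: the abstract excerpt does not state the combination rule, so the median of the interval endpoints is used here as one plausible choice; a one-dimensional least-squares line stands in for the support vector regression used in the paper; the data are synthetic; and the helper names `fit_line`, `split_conformal`, and `combine_median` are hypothetical.

```python
import math
import random
import statistics


def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x (a stand-in for the paper's SVR)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    a = my - b * mx
    return lambda x: a + b * x


def split_conformal(xs, ys, epsilon, seed=0):
    """Split (inductive) conformal regressor built from one source only.

    Returns a function x -> (lower, upper) at significance level epsilon.
    """
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    half = len(idx) // 2
    train, calib = idx[:half], idx[half:]
    model = fit_line([xs[i] for i in train], [ys[i] for i in train])
    # Nonconformity score: absolute residual on the held-out calibration half.
    scores = sorted(abs(ys[i] - model(xs[i])) for i in calib)
    n = len(scores)
    k = math.ceil((n + 1) * (1 - epsilon))
    # If the calibration set is too small for this epsilon, the interval is unbounded.
    q = scores[k - 1] if k <= n else math.inf
    return lambda x: (model(x) - q, model(x) + q)


def combine_median(intervals):
    """Assumed combination rule: median of lower and of upper endpoints."""
    lowers = [low for low, _ in intervals]
    uppers = [up for _, up in intervals]
    return statistics.median(lowers), statistics.median(uppers)


def make_source(seed, n=100):
    """Synthetic source: y = 2x + 1 + Gaussian noise (illustrative only)."""
    rng = random.Random(seed)
    xs = [rng.uniform(0.0, 10.0) for _ in range(n)]
    ys = [2.0 * x + 1.0 + rng.gauss(0.0, 1.0) for x in xs]
    return xs, ys


# Three sources that are never pooled: each trains its own predictor,
# and only the resulting intervals are shared.
predictors = [split_conformal(*make_source(s), epsilon=0.2, seed=s) for s in range(3)]
intervals = [predict(5.0) for predict in predictors]
lo, hi = combine_median(intervals)
print(intervals, (lo, hi))
```

Only the per-source intervals leave each source in this sketch, which is the property the non-disclosed setting needs. Whether a given combination rule preserves validity is a question the paper addresses; the sketch makes no such guarantee.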

Taxonomy

Topics: Machine Learning and Data Classification · Domain Adaptation and Few-Shot Learning · Statistical Methods and Inference