Semi-supervised logistic discrimination via labeled data and unlabeled data from different sampling distributions

Shuichi Kawano

arXiv:1108.5244·stat.ML·February 20, 2014·Stat. Anal. Data Min.

TL;DR

This paper introduces a semi-supervised logistic regression model for classification when labeled and unlabeled data come from different distributions, combining covariate shift adaptation with regularized parameter estimation via the EM algorithm.

Contribution

It proposes a novel semi-supervised logistic regression approach with covariate shift adaptation and an information-theoretic model selection criterion.

Findings

01

Model performs well across various scenarios

02

Effective covariate shift adaptation improves classification accuracy

03

Regularization with the EM algorithm estimates parameters reliably

Abstract

This article addresses the problem of classification based on both labeled and unlabeled data, where we assume that the density function for the labeled data differs from that for the unlabeled data. We propose a semi-supervised logistic regression model for classification problems, combined with the technique of covariate shift adaptation. Unknown parameters in the proposed models are estimated by regularization with the EM algorithm. A crucial issue in the modeling process is the choice of tuning parameters in our semi-supervised logistic models. To select these parameters, a model selection criterion is derived from an information-theoretic approach. Numerical studies show that our modeling procedure performs well in various cases.
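
To make the idea of covariate shift adaptation concrete, here is a minimal sketch of importance-weighted, ridge-penalized logistic regression. It is not the paper's estimator: it leaves out the EM step that uses the unlabeled data and the model selection criterion, and it assumes the density ratio between the unlabeled and labeled input distributions is known in closed form. The function name `fit_weighted_logistic`, the penalty `lam`, and the simulated data are all hypothetical choices made for illustration.

```python
import numpy as np

def fit_weighted_logistic(X, y, w, lam=1e-2, n_iter=50):
    """Newton iterations for importance-weighted logistic regression
    with a ridge penalty (illustrative sketch, not the paper's method)."""
    n, d = X.shape
    Xb = np.hstack([np.ones((n, 1)), X])      # prepend an intercept column
    beta = np.zeros(d + 1)
    P = lam * np.eye(d + 1)
    P[0, 0] = 0.0                             # leave the intercept unpenalized
    for _ in range(n_iter):
        z = np.clip(Xb @ beta, -30.0, 30.0)   # guard against overflow in exp
        p = 1.0 / (1.0 + np.exp(-z))
        # Gradient and Hessian of the weighted negative log-likelihood + penalty
        grad = Xb.T @ (w * (p - y)) / n + P @ beta
        hess = (Xb * (w * p * (1.0 - p))[:, None]).T @ Xb / n + P
        beta = beta - np.linalg.solve(hess, grad)
    return beta

# Simulated setting (assumption): labeled inputs ~ N(0, 1), unlabeled inputs ~ N(1, 1).
rng = np.random.default_rng(0)
n = 500
x_lab = rng.normal(0.0, 1.0, size=n)
prob = 1.0 / (1.0 + np.exp(-(2.0 * x_lab - 1.0)))
y_lab = (rng.random(n) < prob).astype(float)

# Density ratio N(1, 1) / N(0, 1) evaluated at the labeled points: exp(x - 1/2).
w = np.exp(x_lab - 0.5)

beta_weighted = fit_weighted_logistic(x_lab[:, None], y_lab, w)
beta_plain = fit_weighted_logistic(x_lab[:, None], y_lab, np.ones(n))
print("weighted:", beta_weighted, "unweighted:", beta_plain)
```

Each labeled point is weighted by how much more likely it is under the unlabeled (target) input distribution, so the fit concentrates on the region where predictions will be made. In practice the density ratio is unknown and must be estimated, and how strongly to apply the weights is itself a tuning decision of the kind the abstract mentions.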
