Learning from missing data with the Latent Block Model

Gabriel Frisch (Heudiasyc); Jean-Benoist Léger (Heudiasyc); Yves Grandvalet (Heudiasyc)

arXiv:2010.12222 · cs.LG · October 26, 2020

TL;DR

This paper introduces a co-clustering model based on the Latent Block Model to leverage informative missing data, specifically MNAR, improving analysis of complex datasets like voting records.

Contribution

The paper develops a novel co-clustering approach for MNAR data using the Latent Block Model with a variational EM algorithm and model selection, addressing a gap in handling nonignorable missingness.

Findings

1. Effectively identifies meaningful groups in simulated data.

2. Reveals relevant MP and text clusters in French Parliament voting data.

3. Provides interpretable insights into non-voter behavior.

Abstract

Missing data can be informative. Ignoring this information can lead to misleading conclusions when the data model does not allow information to be extracted from the missing data. We propose a co-clustering model, based on the Latent Block Model, that aims to take advantage of these nonignorable nonresponses, also known as Missing Not At Random (MNAR) data. A variational expectation-maximization algorithm is derived to perform inference, and a model selection criterion is presented. We assess the proposed approach in a simulation study before applying our model to the voting records from the lower house of the French Parliament, where our analysis brings out relevant groups of MPs and texts, together with a sensible interpretation of the behavior of non-voters.
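The abstract's opening claim, that ignoring informative missingness can mislead, can be illustrated with a small simulation. The sketch below is not the paper's model or its inference algorithm: it draws a binary matrix from a two-by-two block structure and then hides entries with a probability that depends on the hidden value itself. The block probabilities, the observation rates (0.9 for ones, 0.4 for zeros), and all function names are assumptions chosen for illustration.

```python
import numpy as np


def simulate_lbm_mnar(n_rows=300, n_cols=200, seed=0):
    """Draw a binary block-structured matrix and an MNAR observation mask.

    All numeric settings here are hypothetical, chosen only for illustration.
    """
    rng = np.random.default_rng(seed)
    pi = np.array([[0.8, 0.3], [0.2, 0.6]])  # assumed block probabilities
    z = rng.integers(0, 2, size=n_rows)      # row cluster labels
    w = rng.integers(0, 2, size=n_cols)      # column cluster labels
    # Entry (i, j) is 1 with the probability of its block (z[i], w[j]).
    x = rng.random((n_rows, n_cols)) < pi[z][:, w]
    # MNAR: the chance of observing an entry depends on its own value.
    p_obs = np.where(x, 0.9, 0.4)
    observed = rng.random((n_rows, n_cols)) < p_obs
    return x, observed, z, w, pi


def block_means(x, mask, z, w, n_k=2, n_l=2):
    """Average of x within each block, using only entries where mask is True."""
    out = np.empty((n_k, n_l))
    for k in range(n_k):
        for l in range(n_l):
            block = (z == k)[:, None] & (w == l)[None, :] & mask
            out[k, l] = x[block].mean()
    return out


x, observed, z, w, pi = simulate_lbm_mnar()
# Estimate that ignores the missingness process: observed entries only.
naive = block_means(x, observed, z, w)
# Reference estimate using every entry, which is unavailable in practice.
full = block_means(x, np.ones_like(observed), z, w)
```

Under these assumed settings, ones are observed more often than zeros, so the observed-only block averages in `naive` overestimate the block probabilities relative to `full`. A model that accounts for the missingness process is meant to avoid this kind of bias.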

Taxonomy

Topics: Bayesian Methods and Mixture Models · Computational and Text Analysis Methods · Bayesian Modeling and Causal Inference