Robust Model-Based Clustering of Voting Records
Yang Tang, Paul D. McNicholas, Antonio Punzo

TL;DR
This paper proposes a mixture of latent trait models in which the low-dimensional latent variable follows a contaminated normal distribution, so that extreme voting patterns in U.S. Congressional records can be picked up while the binary data are clustered.
Contribution
The authors present what they describe as the first mixture model for binary data that captures extreme patterns, using contaminated normal distributions for the latent variable, and apply it to voting records.
Findings
Successfully identified extreme voting patterns in Congressional data
Demonstrated the model's ability to cluster binary data with outliers
First application of a contaminated normal mixture model to binary voting data
Abstract
We explore the possibility of discovering extreme voting patterns in the U.S. Congressional voting records by drawing ideas from the mixture of contaminated normal distributions. A mixture of latent trait models via contaminated normal distributions is proposed. We assume that the low-dimensional continuous latent variable comes from a contaminated normal distribution and, therefore, picks up extreme patterns in the observed binary data while clustering. In particular, we consider this model for the analysis of voting records. The model is applied to a U.S. Congressional Voting data set on 16 issues. Note that this approach is the first instance within the literature of a mixture model handling binary data with possible extreme patterns.
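To make the generative idea in the abstract concrete, the following is a minimal simulation sketch for a single cluster, not the authors' implementation. It assumes the usual contaminated normal form, in which a latent point comes from N(mean, cov) with probability alpha and from the inflated N(mean, eta * cov) with eta > 1 otherwise, and it assumes a logistic link from the latent variable to each binary vote. The function names, the parameter names (alpha, eta, weights, intercepts), and the default values are all hypothetical and chosen only for illustration.

```python
import numpy as np


def sample_contaminated_normal(rng, n, mean, cov, alpha, eta):
    """Draw n latent points from a contaminated normal distribution.

    With probability alpha a point comes from N(mean, cov) ("typical");
    otherwise it comes from N(mean, eta * cov) with eta > 1 ("extreme").
    """
    typical = rng.random(n) < alpha
    # Scaling a N(0, cov) draw by sqrt(eta) gives covariance eta * cov.
    scale = np.where(typical, 1.0, np.sqrt(eta))
    z = rng.multivariate_normal(np.zeros(len(mean)), cov, size=n)
    return mean + scale[:, None] * z, typical


def simulate_votes(rng, n, weights, intercepts, alpha=0.9, eta=10.0):
    """Simulate binary votes for one cluster (assumed logistic link).

    weights has shape (n_issues, latent_dim); intercepts has shape (n_issues,).
    """
    latent_dim = weights.shape[1]
    y, typical = sample_contaminated_normal(
        rng, n, np.zeros(latent_dim), np.eye(latent_dim), alpha, eta
    )
    logits = intercepts + y @ weights.T
    probs = 1.0 / (1.0 + np.exp(-logits))
    votes = (rng.random(probs.shape) < probs).astype(int)
    return votes, typical


rng = np.random.default_rng(0)
W = rng.normal(size=(16, 2))  # 16 issues, 2 latent dimensions (illustrative)
b = np.zeros(16)
votes, typical = simulate_votes(rng, 200, W, b)
```

Here `votes` is a 200 by 16 binary matrix and `typical` flags which simulated voters came from the non-inflated component. A full mixture would first draw a cluster label for each voter and then use that cluster's own parameters; fitting the model to real data is a separate estimation problem that this sketch does not address.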
Taxonomy
Topics: Rough Sets and Fuzzy Logic · Natural Language Processing Techniques · Bayesian Methods and Mixture Models
