Biclustering random matrix partitions with an application to classification of forensic body fluids
Chieh-Hsi Wu, Amy D. Roeder, Geoff K. Nicholls

TL;DR
This paper introduces BDP-CaRMa, a biclustering Dirichlet process model for classifying unlabeled data with block structure. Demonstrated on forensic body fluid classification, it gives interpretable results and well-calibrated posterior probabilities.
Contribution
It develops a novel hierarchical Bayesian biclustering model that handles multiple data matrices whose row counts are unknown because some class labels are missing, improving interpretability in forensic data classification.
Findings
Effective classification of forensic body fluids using mRNA profiles
Produces interpretable results with well-calibrated posterior probabilities
Applicable to other data types with similar block structures
Abstract
Classification of unlabeled data is usually achieved by supervised learning from labeled samples. Although many sophisticated supervised machine learning methods can predict the missing labels with a high level of accuracy, they often lack the required transparency in situations where it is important to provide interpretable results and meaningful measures of confidence. Body fluid classification of forensic casework data is a case in point. We develop a new Biclustering Dirichlet Process for Class-assignment with Random Matrices (BDP-CaRMa), with a three-level hierarchy of clustering, and a model-based approach to classification that adapts to block structure in the data matrix. As the class labels of some observations are missing, the number of rows in the data matrix for each class is unknown. BDP-CaRMa handles this and extends existing biclustering methods by…
Taxonomy
Topics: Bayesian Methods and Mixture Models · Forensic and Genetic Research
