A Multi-Dimensional Clustering Approach for Identifying Inborn Errors of Immunity
Nishad Kulkarni, Alexandra K. Martinson, Nicholas L. Rider, Michael Keller, Syed Muhammad Anwar

TL;DR
This paper introduces a multi-dimensional clustering pipeline that processes complex immunologic EHR data to identify novel patterns in inborn errors of immunity (IEI), aiding early diagnosis of rare diseases.
Contribution
It develops a new data curation and clustering methodology specifically designed for complex medical records in IEI, enhancing pattern recognition and feature extraction.
Findings
The pipeline transforms raw immunologic lab data into ML-compatible vectors.
Effective hyperparameter tuning improves disease pattern recognition.
The pipeline identifies novel IEI features from national registry data.
Abstract
Rare diseases such as inborn errors of immunity (IEI) require early diagnosis to prevent end-organ damage and improve quality of life. Hurdles in accessing and curating large-scale electronic health record (EHR) data limit the routine data-driven analyses needed to stay at the forefront of trends in IEI and other rare diseases. Both the development of machine learning (ML) algorithms for pattern recognition in IEI and published methodology on how to systematically process and integrate complex medical data are limited. Our proposed pipeline, including data curation and ML clustering algorithms, is designed to recognize novel rare disease patterns and extract IEI-associated features from a national data registry. Our methodology for EHR data formatting and processing presents a pipeline that transforms raw immunologic lab data into vectors. This is further combined with hyperparameter tuning…
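The abstract does not spell out how raw immunologic lab data become vectors, so the following is only a minimal sketch of one common approach, not the authors' method. It assumes a hypothetical long-format record layout of (patient ID, test name, value), aggregates repeated measurements by their mean, fills tests a patient never had with the column mean, and z-scores each column. The record layout, the test names, the function name `vectorize`, and all three preprocessing choices are illustrative assumptions.

```python
from collections import defaultdict
from statistics import mean, pstdev

# Hypothetical long-format lab records: (patient_id, test_name, value).
# Values and test names are made up for illustration only.
records = [
    ("p1", "IgG", 450.0),
    ("p1", "IgA", 20.0),
    ("p1", "IgG", 550.0),   # repeated measurement of the same test
    ("p2", "IgG", 1100.0),
    ("p2", "IgA", 180.0),
    ("p2", "CD4", 900.0),
    ("p3", "IgA", 100.0),
    ("p3", "CD4", 300.0),
]


def vectorize(records):
    """Return (patient_ids, feature_names, matrix) with z-scored columns."""
    # Collect every measurement per (patient, test).
    per_patient = defaultdict(lambda: defaultdict(list))
    for patient_id, test_name, value in records:
        per_patient[patient_id][test_name].append(value)

    patient_ids = sorted(per_patient)
    feature_names = sorted({t for tests in per_patient.values() for t in tests})

    # Aggregate repeats by their mean; None marks a test with no result.
    raw = [
        [mean(per_patient[p][t]) if t in per_patient[p] else None
         for t in feature_names]
        for p in patient_ids
    ]

    matrix = [[0.0] * len(feature_names) for _ in patient_ids]
    for j in range(len(feature_names)):
        observed = [row[j] for row in raw if row[j] is not None]
        mu = mean(observed)
        sigma = pstdev(observed) or 1.0  # guard against constant columns
        for i, row in enumerate(raw):
            # Assumed imputation: a missing value takes the column mean,
            # which becomes 0 after z-scoring.
            value = mu if row[j] is None else row[j]
            matrix[i][j] = (value - mu) / sigma
    return patient_ids, feature_names, matrix


patient_ids, feature_names, matrix = vectorize(records)
for pid, row in zip(patient_ids, matrix):
    print(pid, [round(x, 3) for x in row])
```

Each row of `matrix` is a fixed-length vector for one patient, with columns ordered as in `feature_names`, which is the shape a clustering algorithm expects. Mean imputation is the simplest choice here; with sparse registry data, a missingness indicator or a model-based imputer may be more appropriate.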
