Sirius: Visualization of Mixed Features as a Mutual Information Network Graph
Jane L. Adams, Todd F. Deluca, Christopher M. Danforth, Peter S. Dodds, Yuhang Zheng, Konstantinos Anastasakis, Boyoon Choi, Allison Min, Michael M. Bessey

TL;DR
Sirius is a visualization tool that uses mutual information to explore relationships among mixed data features, aiding data scientists in understanding dependencies before modeling.
Contribution
It introduces a novel information-theoretic visualization package that supports heterogeneous data types through an interactive network interface.
Findings
Supports network visualization of mixed data types
Helps identify meaningful feature dependencies
Aids in feature selection and data quality assessment
Abstract
Data scientists across disciplines increasingly need exploratory analysis tools for data sets with a high volume of features of mixed data types (quantitative continuous and discrete categorical). We introduce Sirius, a novel visualization package for researchers to explore feature relationships among mixed data types using mutual information. Visualizing feature relationships helps data scientists find meaningful dependence among features before they develop predictive modeling pipelines, which can inform downstream analysis such as feature selection, feature extraction, and early detection of potential proxy variables. Using an information-theoretic approach, Sirius supports network visualization of heterogeneous data sets (consisting of continuous and discrete data types), and provides a user interface for exploring feature pairs with locally…
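The idea in the abstract, scoring every feature pair by mutual information and treating the scores as edge weights in a network, can be illustrated with a minimal sketch. This is not Sirius's API or its estimator: the function names (`discretize`, `mutual_information`, `mi_edges`), the equal-width binning of continuous columns, and the toy data are all assumptions made for illustration only.

```python
import math
from collections import Counter
from itertools import combinations


def discretize(values, bins=4):
    """Equal-width binning of a continuous column (an assumption of this sketch)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / bins
    # Clamp the maximum value into the last bin.
    return [min(int((v - lo) / width), bins - 1) for v in values]


def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in bits for two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log2(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def mi_edges(columns, continuous=(), threshold=0.0):
    """Return (feature_a, feature_b, mi) for pairs whose MI exceeds threshold."""
    prepared = {
        name: discretize(vals) if name in continuous else list(vals)
        for name, vals in columns.items()
    }
    edges = []
    for a, b in combinations(prepared, 2):
        mi = mutual_information(prepared[a], prepared[b])
        if mi > threshold:
            edges.append((a, b, mi))
    # Strongest dependencies first.
    return sorted(edges, key=lambda e: e[2], reverse=True)


# Hypothetical toy data: one categorical, one continuous, one unrelated feature.
columns = {
    "color": ["r", "r", "b", "b", "r", "r", "b", "b"],
    "height": [1.0, 1.2, 3.9, 4.1, 0.9, 1.1, 4.0, 3.8],
    "coin": ["h", "t", "h", "t", "h", "t", "h", "t"],
}
edges = mi_edges(columns, continuous={"height"})
for a, b, mi in edges:
    print(f"{a} -- {b}: {mi:.3f} bits")
```

In this toy example only the color/height pair survives the threshold, because the coin column is independent of the other two. The resulting edge list could be handed to any graph library for layout; the binning step is the crudest part of the sketch, and a real tool may well handle continuous features differently.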
Taxonomy
Topics: Complex Network Analysis Techniques · Mental Health Research Topics · Data Visualization and Analytics
