Explainable Machine Learning for Categorical and Mixed Data with Lossless Visualization
Boris Kovalerchuk, Elijah McCoy

TL;DR
This paper introduces methods for numerically encoding and losslessly visualizing mixed categorical and numeric data, and for generating rules that make machine learning models on such data interpretable.
Contribution
It develops numeric coding schemes for non-numeric attributes, a toolkit for interpretability, and a new Sequential Rule Generation (SRG) algorithm for explainable models on mixed data.
Findings
Successful evaluation of the SRG algorithm in experiments
Effective lossless visualization of n-D categorical data
Enhanced interpretability of ML models on mixed data
Abstract
Building accurate and interpretable Machine Learning (ML) models for heterogeneous/mixed data is a long-standing challenge for algorithms designed for numeric data. This work focuses on developing (1) numeric coding schemes for non-numeric attributes that support accurate and explainable ML models, (2) methods for lossless visualization of n-D non-numeric categorical data with visual rule discovery in these visualizations, and (3) accurate and explainable ML models for categorical data. This study proposes a classification of mixed data types and analyzes their important role in Machine Learning. It presents a toolkit for enforcing interpretability of all internal operations of ML algorithms on mixed data, together with visual exploration of mixed data. A new Sequential Rule Generation (SRG) algorithm for explainable rule generation with categorical data is proposed and successfully…
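To make the idea of a numeric coding scheme with lossless visualization concrete, here is a minimal sketch. It is not the paper's coding scheme, which this summary does not specify. It assumes a simple invertible integer code per attribute, with values in alphabetical order (an arbitrary choice), and a parallel-coordinates style polyline as the visual representation. All function names and the toy data are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Row = Tuple[str, ...]


def build_codes(rows: Sequence[Row]) -> List[Dict[str, int]]:
    """Build one codebook per attribute: distinct values -> integer codes.

    Assumption: alphabetical order defines the codes. Any one-to-one
    mapping would keep the encoding invertible.
    """
    n_attrs = len(rows[0])
    codebooks = []
    for j in range(n_attrs):
        values = sorted({row[j] for row in rows})
        codebooks.append({value: code for code, value in enumerate(values)})
    return codebooks


def encode(row: Row, codebooks: List[Dict[str, int]]) -> Tuple[int, ...]:
    """Replace each categorical value with its integer code."""
    return tuple(cb[value] for cb, value in zip(codebooks, row))


def decode(codes: Tuple[int, ...], codebooks: List[Dict[str, int]]) -> Row:
    """Invert the encoding; this works because each codebook is one-to-one."""
    inverses = [{code: value for value, code in cb.items()} for cb in codebooks]
    return tuple(inv[code] for inv, code in zip(inverses, codes))


def polyline(codes: Tuple[int, ...]) -> List[Tuple[int, int]]:
    """Vertices of a parallel-coordinates polyline: (axis index, code).

    Every attribute keeps its own axis, so the n-D point can be read
    back from the vertices without loss.
    """
    return [(axis, code) for axis, code in enumerate(codes)]


# Hypothetical toy data: three categorical attributes per record.
rows = [
    ("red", "small", "yes"),
    ("blue", "large", "no"),
    ("red", "large", "no"),
]
codebooks = build_codes(rows)
encoded = [encode(r, codebooks) for r in rows]
decoded = [decode(e, codebooks) for e in encoded]
```

Decoding the encoded records returns the original rows, which is the sense in which this toy encoding is lossless. Note that integer codes impose an order on values that may have none, so any rule read off such a plot depends on the chosen coding.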
Taxonomy
Topics: Data Visualization and Analytics · Anomaly Detection Techniques and Applications · Machine Learning and Data Classification
