Explainable Machine Learning for Categorical and Mixed Data with Lossless Visualization

Boris Kovalerchuk; Elijah McCoy

arXiv:2305.18437 · cs.LG · November 27, 2023 · 1 citation

Open Access

TL;DR

This paper introduces methods for encoding, visualizing, and explaining machine learning models on mixed categorical and numeric data, emphasizing lossless visualization and rule generation for interpretability.

Contribution

It develops numeric coding schemes, a toolkit for interpretability, and a new Sequential Rule Generation algorithm for explainable models on mixed data.

Findings

01  Successful evaluation of the SRG algorithm in experiments

02  Effective lossless visualization of n-D categorical data

03  Enhanced interpretability of ML models on mixed data

Abstract

Building accurate and interpretable Machine Learning (ML) models for heterogeneous/mixed data is a long-standing challenge for algorithms designed for numeric data. This work develops (1) numeric coding schemes for non-numeric attributes that support accurate and explainable ML models, (2) methods for lossless visualization of n-D non-numeric categorical data with visual rule discovery in these visualizations, and (3) accurate and explainable ML models for categorical data. The study proposes a classification of mixed data types and analyzes their role in Machine Learning. It presents a toolkit for enforcing interpretability of all internal operations of ML algorithms on mixed data, together with visual exploration of such data. A new Sequential Rule Generation (SRG) algorithm for explainable rule generation with categorical data is proposed and successfully…
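As a rough illustration of what a numeric coding scheme for a non-numeric attribute can look like, the sketch below ranks the categories of a nominal attribute by how often they co-occur with a chosen class and uses the rank as the numeric code. This ordering rule is an assumption chosen for illustration and is not claimed to be the scheme used in the paper; the function name `order_codes` and the toy data are hypothetical.

```python
from collections import defaultdict


def order_codes(values, labels, positive):
    """Map each category to an integer rank.

    Categories are ordered by the share of `positive` labels among the
    rows carrying that category (ties broken alphabetically). This is an
    illustrative coding rule, not the paper's scheme.
    """
    total = defaultdict(int)  # rows per category
    hits = defaultdict(int)   # rows per category with the positive label
    for value, label in zip(values, labels):
        total[value] += 1
        if label == positive:
            hits[value] += 1
    ranked = sorted(total, key=lambda v: (hits[v] / total[v], v))
    return {value: rank for rank, value in enumerate(ranked)}


# Hypothetical toy data: one nominal attribute and a two-class label.
colors = ["red", "green", "red", "blue", "green", "blue"]
labels = ["edible", "toxic", "edible", "toxic", "toxic", "edible"]

codes = order_codes(colors, labels, positive="toxic")
encoded = [codes[c] for c in colors]
print(codes)    # category -> numeric code
print(encoded)  # the attribute column after coding
```

Because the codes follow the class share, a threshold on the coded attribute reads back as a statement about categories, which keeps a rule over the numeric codes interpretable in terms of the original values.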

Taxonomy

Topics: Data Visualization and Analytics · Anomaly Detection Techniques and Applications · Machine Learning and Data Classification