Explainable Mixed Data Representation and Lossless Visualization Toolkit for Knowledge Discovery
Boris Kovalerchuk, Elijah McCoy

TL;DR
This paper introduces a toolkit for interpretable machine learning and lossless visualization tailored for heterogeneous, multidimensional mixed data, facilitating knowledge discovery and domain insights.
Contribution
It proposes a novel classification of mixed data types and develops an experimental toolkit combining data editing, visualization, and rule discovery for better interpretability.
Findings
Toolkit enables lossless visualization of multidimensional mixed data.
Supports interpretable ML models for heterogeneous data.
Available on GitHub for community use.
Abstract
Developing Machine Learning (ML) algorithms for heterogeneous/mixed data is a longstanding problem. Many ML algorithms cannot be applied to mixed data, which include numeric and non-numeric data, text, graphs, and so on, to generate interpretable models. Another longstanding problem is developing algorithms for lossless visualization of multidimensional mixed data. Further progress in ML heavily depends on the success of interpretable ML algorithms for mixed data and of lossless interpretable visualization of multidimensional data. The latter allows end-users to develop interpretable ML models through visual knowledge discovery, bringing valuable domain knowledge that is absent from the training data. The challenges for mixed data include: (1) generating numeric coding schemes for non-numeric attributes so that numeric ML algorithms can provide accurate and interpretable ML models, (2)…
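To make challenge (1) concrete, here is a minimal Python sketch of two common ways to code a non-numeric attribute as numbers. It is not the toolkit's coding scheme, which the abstract does not describe. The function names `ordinal_code` and `target_rate_code`, and the toy data, are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def ordinal_code(values, order):
    """Code an ordered attribute by each category's position in `order`."""
    mapping = {category: index for index, category in enumerate(order)}
    return [mapping[v] for v in values]


def target_rate_code(values, labels):
    """Code a nominal attribute by the share of positive-class cases per category.

    `labels` are assumed to be 0/1 class labels aligned with `values`.
    """
    counts = defaultdict(lambda: [0, 0])  # category -> [positives, total]
    for value, label in zip(values, labels):
        counts[value][0] += label
        counts[value][1] += 1
    return {category: pos / total for category, (pos, total) in counts.items()}


# Ordered attribute: the numeric code preserves the natural order.
sizes = ["low", "high", "medium"]
size_codes = ordinal_code(sizes, order=["low", "medium", "high"])

# Nominal attribute: no natural order, so the code is derived from the class labels.
colors = ["red", "green", "red", "blue", "green", "red"]
labels = [1, 0, 1, 1, 0, 0]
color_codes = target_rate_code(colors, labels)
```

The choice of scheme affects interpretability: an ordinal code is only meaningful when the categories have a natural order, while a label-derived code gives a nominal attribute an order that a domain expert can inspect and question.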
Taxonomy
Topics: Data Visualization and Analytics · Data Mining Algorithms and Applications · Explainable Artificial Intelligence (XAI)
