Clustering-Based Approaches for Symbolic Knowledge Extraction

Federico Sabbatini; Roberta Calegari

arXiv:2211.00234·cs.AI·November 2, 2022

Open Access

TL;DR

This paper proposes running a clustering step before symbolic knowledge extraction from black-box regressors, arguing that the usual hypercubic partitioning of the input space can be suboptimal on high-dimensional or asymmetric data sets.

Contribution

It proposes a (deep) clustering step, performed before symbolic knowledge extraction from opaque regressors, to address the limitations of hypercubic input-space partitioning.

Findings

01 Clustering before extraction improves the quality of the extracted symbolic rules.

02 The method is effective on high-dimensional and asymmetric data sets.

03 The approach improves the interpretability of black-box models.

Abstract

Opaque machine learning models are increasingly used across a wide range of application areas. These models, acting as black boxes (BB) from the human perspective, cannot be entirely trusted in critical applications unless there is a method to extract symbolic, human-readable knowledge from them. In this paper we analyse a recurrent design adopted by symbolic knowledge extractors for BB regressors: the creation of rules associated with hypercubic input space regions. We argue that this kind of partitioning may lead to suboptimal solutions when the data set at hand is high-dimensional or does not satisfy symmetry constraints. We then propose a (deep) clustering-based approach, to be performed before symbolic knowledge extraction, to achieve better performance with data sets of any kind.
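The general idea of clustering first and extracting rules afterwards can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes a hypothetical stand-in regressor (`black_box`), uses plain k-means in place of the (deep) clustering the paper refers to, and describes each cluster by its bounding hypercube with a constant output. All function names and the toy data are illustrative assumptions.

```python
import random

def black_box(x):
    """Hypothetical opaque regressor (assumption): piecewise-constant in x[0]."""
    return 1.0 if x[0] < 0.5 else 5.0

def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm; returns one cluster label per point."""
    ordered = sorted(points)
    n = len(ordered)
    # Deterministic initialisation: evenly spaced points in sorted order.
    centroids = [ordered[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels

def extract_rules(points, labels, k):
    """One rule per cluster: (lower corner, upper corner, constant output)."""
    rules = []
    for c in range(k):
        members = [p for p, l in zip(points, labels) if l == c]
        if not members:
            continue
        lower = tuple(min(col) for col in zip(*members))
        upper = tuple(max(col) for col in zip(*members))
        # The rule output is the mean black-box prediction inside the cluster.
        output = sum(black_box(p) for p in members) / len(members)
        rules.append((lower, upper, output))
    return rules

def predict(rules, x):
    """Return the output of the first rule whose hypercube contains x, else None."""
    for lower, upper, output in rules:
        if all(lo <= v <= hi for lo, v, hi in zip(lower, x, upper)):
            return output
    return None

# Toy data (assumption): two well-separated blobs in the unit square.
rng = random.Random(0)
centres = [(0.2, 0.2), (0.8, 0.7)]
points = [
    (cx + rng.uniform(-0.1, 0.1), cy + rng.uniform(-0.1, 0.1))
    for cx, cy in centres
    for _ in range(50)
]

labels = kmeans(points, k=2)
rules = extract_rules(points, labels, k=2)
for lower, upper, output in rules:
    print(f"if {lower} <= x <= {upper} then y = {output}")
```

In this sketch the hypercubes are fitted to regions where data actually lies, so no rule is spent on empty parts of the input space; a fixed symmetric grid over the same square would not have that property. How the paper itself builds and evaluates its regions is not specified in the abstract.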

Taxonomy

Topics: Neural Networks and Applications · Time Series Analysis and Forecasting · Data Mining Algorithms and Applications