Towards Automation of Knowledge Understanding: An Approach for Probabilistic Generative Classifiers
Dominik Fisch, Christian Gruhl, Edgar Kalkowski, Bernhard Sick, Seppo J. Ovaska

TL;DR
This paper introduces objective, quantitative measures for evaluating the quality and usefulness of rules generated by probabilistic generative classifiers, aiding knowledge understanding in data mining.
Contribution
It proposes novel objective metrics for assessing rules in probabilistic generative classifiers, enhancing interpretability and application of mined knowledge.
Findings
Objective measures for rule informativeness and importance
Metrics support evaluation and improvement of data mining results
Case studies demonstrate practical utility of measures
Abstract
After data selection, pre-processing, transformation, and feature extraction, knowledge extraction is not the final step in a data mining process. It is then necessary to understand this knowledge in order to apply it efficiently and effectively. To date, appropriate techniques to support this significant step have been lacking. This is partly because the assessment of knowledge is often highly subjective, e.g., regarding aspects such as novelty or usefulness. These aspects depend on the specific knowledge and requirements of the data miner. There are, however, a number of aspects that are objective and for which it is possible to provide appropriate measures. In this article, we focus on classification problems and use probabilistic generative classifiers based on mixture density models, which are quite common in data mining applications. We define objective measures to…
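For readers unfamiliar with the term, the following is a minimal sketch of what a probabilistic generative classifier based on mixture density models can look like. It is a generic illustration under stated assumptions, not the authors' classifier, and it does not reproduce the measures defined in the paper. The one-dimensional toy model, its parameter values, and the names `MODEL`, `gaussian_pdf`, `class_posteriors`, and `classify` are all hypothetical.

```python
import numpy as np


def gaussian_pdf(x, mean, std):
    """Density of a univariate normal distribution evaluated at x."""
    z = (x - mean) / std
    return np.exp(-0.5 * z * z) / (std * np.sqrt(2.0 * np.pi))


# Hypothetical toy model (all numbers are made up for illustration):
# each class has a prior and a 1-D Gaussian mixture given as
# (weight, mean, std) triples whose weights sum to 1.
MODEL = {
    "A": {"prior": 0.5, "components": [(0.6, -2.0, 1.0), (0.4, 0.0, 0.5)]},
    "B": {"prior": 0.5, "components": [(1.0, 3.0, 1.0)]},
}


def class_posteriors(x, model=MODEL):
    """Return p(class | x) using Bayes' rule on class-conditional mixtures."""
    joint = {}
    for label, spec in model.items():
        # Class-conditional density p(x | class) as a weighted sum of components.
        likelihood = sum(w * gaussian_pdf(x, m, s) for w, m, s in spec["components"])
        joint[label] = spec["prior"] * likelihood
    evidence = sum(joint.values())  # p(x), the normalizing constant
    return {label: value / evidence for label, value in joint.items()}


def classify(x, model=MODEL):
    """Pick the class with the highest posterior probability."""
    posteriors = class_posteriors(x, model)
    return max(posteriors, key=posteriors.get)
```

Calling `classify(x)` returns the label with the largest posterior, and `class_posteriors(x)` exposes the full distribution over classes. In a model of this kind, each mixture component can be read as a local, rule-like description of a region of the input space.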
Taxonomy
Topics: Data Mining Algorithms and Applications · Algorithms and Data Compression · Time Series Analysis and Forecasting
