Substructure Discovery Using Minimum Description Length and Background Knowledge
D. J. Cook, L. B. Holder

TL;DR
This paper introduces a new version of SUBDUE, a substructure discovery system based on the minimum description length principle. It finds substructures that compress structural data and represent its structural concepts, and repeated passes yield a hierarchical description of the data.
Contribution
The paper presents a new version of SUBDUE that incorporates background knowledge and approximate graph matching to improve substructure discovery and hierarchical data representation.
Findings
SUBDUE effectively compresses data by discovering meaningful substructures.
The system can incorporate background knowledge to guide discovery.
Experiments demonstrate SUBDUE's ability to find important structural concepts.
Abstract
The ability to identify interesting and repetitive substructures is an essential component to discovering knowledge in structural data. We describe a new version of our SUBDUE substructure discovery system based on the minimum description length principle. The SUBDUE system discovers substructures that compress the original data and represent structural concepts in the data. By replacing previously-discovered substructures in the data, multiple passes of SUBDUE produce a hierarchical description of the structural regularities in the data. SUBDUE uses a computationally-bounded inexact graph match that identifies similar, but not identical, instances of a substructure and finds an approximate measure of closeness of two substructures when under computational constraints. In addition to the minimum description length principle, other background knowledge can be used by SUBDUE to guide the…
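The compression idea in the abstract can be illustrated with a minimal sketch. It assumes a deliberately crude bit-count encoding that is not the encoding used in the paper, and it takes the number of non-overlapping instances as given rather than finding them by graph matching. The names `description_length` and `compression_value` are hypothetical. A candidate substructure is scored by the ratio of the original graph's description length to the combined length of the substructure and the graph after each instance is collapsed into a single vertex.

```python
import math


def description_length(num_vertices, num_edges, num_labels):
    """Crude bit count for a labeled graph (illustrative encoding, not the paper's)."""
    if num_vertices == 0:
        return 0.0
    label_bits = math.log2(max(num_labels, 2))     # bits to name one label
    vertex_bits = math.log2(max(num_vertices, 2))  # bits to name one vertex
    vertices_cost = num_vertices * label_bits
    # Each edge stores two endpoint ids and one label.
    edges_cost = num_edges * (2 * vertex_bits + label_bits)
    return vertices_cost + edges_cost


def compression_value(graph, sub, num_instances, num_labels):
    """Return DL(G) / (DL(S) + DL(G|S)); values above 1 mean S compresses G."""
    g_vertices, g_edges = graph
    s_vertices, s_edges = sub
    # Assumption: instances do not overlap. Each one collapses to a single
    # vertex, so its internal edges disappear and one new label is introduced.
    c_vertices = g_vertices - num_instances * (s_vertices - 1)
    c_edges = g_edges - num_instances * s_edges
    before = description_length(g_vertices, g_edges, num_labels)
    after = (description_length(s_vertices, s_edges, num_labels)
             + description_length(c_vertices, c_edges, num_labels + 1))
    return before / after


# Hypothetical data: 12 vertices, 12 edges, 3 labels; the candidate is a triangle.
graph = (12, 12)
triangle = (3, 3)
many = compression_value(graph, triangle, 4, 3)  # four instances replaced
few = compression_value(graph, triangle, 1, 3)   # one instance replaced
print(many, few)
```

Under this toy encoding, a substructure with more instances scores higher, which matches the intuition that repetitive substructures compress the data best. Running the same scoring again on the collapsed graph is one way to picture how multiple passes could build a hierarchy.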
Taxonomy
Topics: Semantic Web and Ontologies · Rough Sets and Fuzzy Logic · Software Engineering Research
