Information-theoretic Interestingness Measures for Cross-Ontology Data Mining
Prashanti Manda, Fiona McCarthy, Bindu Nanduri, Hui Wang, Susan M. Bridges

TL;DR
This paper introduces a novel data mining approach using ontology-guided generalization and an information-theoretic interestingness metric to discover high-quality relationships across biological ontologies, demonstrated on mouse genome annotation data.
Contribution
It presents a new ontology-guided data mining method and a novel interestingness metric based on information theory for cross-ontology relationship discovery.
Findings
Effective discovery of relationships between developmental stages and gene functions.
The proposed interestingness metric outperforms four existing metrics.
Application to mouse genome data demonstrates practical utility.
Abstract
Community annotation of biological entities with concepts from multiple bio-ontologies has created large and growing repositories of ontology-based annotation data with embedded implicit relationships among orthogonal ontologies. Development of efficient data mining methods and metrics to mine and assess the quality of the mined relationships has not kept pace with the growth of annotation data. In this study, we present a data mining method that uses ontology-guided generalization to discover relationships across ontologies along with a new interestingness metric based on information theory. We apply our data mining algorithm and interestingness measures to datasets from the Gene Expression Database at the Mouse Genome Informatics as a preliminary proof of concept to mine relationships between developmental stages in the mouse anatomy ontology and Gene Ontology concepts (biological…
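To make the two ideas in the abstract concrete, the sketch below illustrates ontology-guided generalization (propagating each annotation up to its ancestor terms) and a simple information-theoretic score for a term. It is not the interestingness metric from the paper, whose definition is not given in this summary. It uses the familiar information content, IC(t) = -log2 p(t), as an assumed stand-in. The toy ontology, the gene names, and the function names are all hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy ontology as a DAG: term -> list of parent terms.
PARENTS = {
    "root": [],
    "metabolic process": ["root"],
    "developmental process": ["root"],
    "organ development": ["developmental process"],
    "heart development": ["organ development"],
}


def ancestors(term, parents=PARENTS):
    """Return the term itself plus every ancestor reachable via parent links."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def generalize(annotations, parents=PARENTS):
    """Expand each entity's annotation set with all ancestor terms."""
    return {
        entity: set().union(*(ancestors(t, parents) for t in terms))
        for entity, terms in annotations.items()
    }


def information_content(generalized):
    """IC(t) = -log2(fraction of entities annotated with t after generalization)."""
    total = len(generalized)
    counts = Counter(t for terms in generalized.values() for t in terms)
    return {t: -math.log2(c / total) for t, c in counts.items()}


# Hypothetical annotations: four genes, each with one direct term.
genes = {
    "g1": {"heart development"},
    "g2": {"organ development"},
    "g3": {"metabolic process"},
    "g4": {"metabolic process"},
}

generalized = generalize(genes)
ic = information_content(generalized)
for term, score in sorted(ic.items(), key=lambda kv: kv[1]):
    print(f"{term}: {score:.2f}")
```

In this toy data, the root term is shared by every gene and so carries no information, while the most specific term scores highest. A cross-ontology miner has to balance the two: generalizing terms raises support for a candidate relationship but lowers how informative it is.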
Taxonomy
Topics: Biomedical Text Mining and Ontologies · Bioinformatics and Genomic Networks · Semantic Web and Ontologies
