Using Entropy Estimates for DAG-Based Ontologies

Andrew Warren; Joao Setubal

arXiv:1403.4887 · cs.CL · June 20, 2017 · 2 cites

Open Access

TL;DR

This paper introduces a new method for calculating entropy in DAG-based ontologies to improve the estimation of information content for semantic similarity, addressing limitations of traditional frequency-based approaches.

Contribution

It presents a novel entropy calculation for DAG-based ontologies and compares it with existing information content metrics using semantic and sequence similarity.

Findings

01 New entropy-based IC metric shows improved correlation with semantic similarity

02 Method outperforms traditional frequency-based IC calculations

03 Enhanced accuracy in gene annotation similarity assessments

Abstract

Motivation: Entropy measurements on hierarchical structures have been used in methods for information retrieval and natural language modeling. Here we explore their application to semantic similarity. By finding shared ontology terms, semantic similarity can be established between annotated genes. A common procedure for establishing semantic similarity is to calculate the descriptiveness (information content) of ontology terms and use these values to determine the similarity of annotations. Most often, information content is calculated for an ontology term by analyzing its frequency in an annotation corpus. The inherent problems in using these values to model functional similarity motivate our work. Summary: We present a novel calculation for establishing the entropy of a DAG-based ontology, which can be used in an alternative method for establishing the information content of its terms.…
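
For orientation, the frequency-based information content that the abstract describes as the common baseline can be sketched as follows. This is not the paper's entropy method, which the truncated abstract does not spell out. It assumes the widely used definition IC(t) = -log2 p(t), where p(t) is the fraction of genes annotated to term t or to any of its descendants. The toy ontology, the gene names, and the helper functions `ancestors` and `information_content` are all hypothetical and exist only for illustration.

```python
import math

# Hypothetical toy ontology: term -> list of parent terms.
# It is a DAG, so a term ("kinase") may have more than one parent.
PARENTS = {
    "root": [],
    "binding": ["root"],
    "catalysis": ["root"],
    "kinase": ["catalysis", "binding"],
}

# Hypothetical annotation corpus: gene -> directly annotated terms.
ANNOTATIONS = {
    "geneA": {"kinase"},
    "geneB": {"binding"},
    "geneC": {"catalysis"},
    "geneD": {"kinase"},
}


def ancestors(term, parents):
    """Return the term together with all of its ancestors in the DAG."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def information_content(annotations, parents):
    """Frequency-based IC: -log2 of the fraction of genes annotated
    to a term directly or through one of its descendants."""
    counts = {term: 0 for term in parents}
    for terms in annotations.values():
        # An annotation to a term implies annotation to all its ancestors.
        implied = set()
        for term in terms:
            implied |= ancestors(term, parents)
        for term in implied:
            counts[term] += 1
    total = len(annotations)
    # Terms with no annotations have undefined IC here and are skipped.
    return {t: -math.log2(c / total) for t, c in counts.items() if c > 0}


ic = information_content(ANNOTATIONS, PARENTS)
```

In this sketch the root term covers every gene and so carries no information, while more specific, less frequently used terms score higher. Because the values come entirely from corpus counts, they change whenever the annotation corpus changes.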

Taxonomy

Topics: Semantic Web and Ontologies · Biomedical Text Mining and Ontologies · Natural Language Processing Techniques