Hierarchy exploitation to detect missing annotations on hierarchical multi-label classification
Miguel Romero, Felipe Kenji Nakano, Jorge Finke, Camilo Rocha, Celine Vens

TL;DR
This paper introduces a hierarchy-aware method for detecting missing gene function annotations in hierarchical multi-label datasets, improving prediction accuracy by leveraging class hierarchy information.
Contribution
The paper presents an approach that exploits the class hierarchy, through probabilities aggregated along class paths, to identify missing annotations, and reports better performance than existing methods.
Findings
Hierarchy exploitation improves annotation prediction accuracy.
The method outperforms existing approaches on rice genomic data.
Incorporating hierarchy aids in identifying missing gene functions.
Abstract
The availability of genomic data has grown exponentially in the last decade, mainly due to the development of new sequencing technologies. Based on the interactions between genes (and gene products) extracted from the increasing genomic data, numerous studies have focused on the identification of associations between genes and functions. While these studies have shown great promise, the problem of annotating genes with functions remains an open challenge. In this work, we present a method to detect missing annotations in hierarchical multi-label classification datasets. We propose a method that exploits the class hierarchy by computing aggregated probabilities to the paths of classes from the leaves to the root for each instance. The proposed method is presented in the context of predicting missing gene function annotations, where these aggregated probabilities are further used to…
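The path aggregation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy hierarchy, the function names, and the choice of the mean as the aggregation function are all assumptions made for illustration, and the abstract does not say which aggregation is used or how the resulting scores are consumed downstream.

```python
from statistics import mean

# Hypothetical toy hierarchy, stored as child -> parent (None marks the root).
PARENT = {"root": None, "A": "root", "B": "root", "A1": "A", "A2": "A", "B1": "B"}


def leaves(parent):
    """Return the classes that are not the parent of any other class."""
    internal = {p for p in parent.values() if p is not None}
    return sorted(c for c in parent if c not in internal)


def path_to_root(leaf, parent):
    """Return the classes from a leaf up to, but excluding, the root."""
    path = []
    node = leaf
    while parent[node] is not None:
        path.append(node)
        node = parent[node]
    return path


def aggregate_paths(probs, parent, agg=mean):
    """Map each leaf to the aggregated probability of its leaf-to-root path.

    `probs` holds one instance's per-class probabilities from any base
    classifier. `agg` is an assumption: the mean is used here only as a
    placeholder for whatever aggregation the method actually applies.
    """
    return {
        leaf: agg([probs[c] for c in path_to_root(leaf, parent)])
        for leaf in leaves(parent)
    }


# Example per-class probabilities for a single instance (made-up values).
example_probs = {"A": 0.9, "B": 0.2, "A1": 0.7, "A2": 0.1, "B1": 0.4}
path_scores = aggregate_paths(example_probs, PARENT)
```

Passing a different callable as `agg` (for example `min` or `max`) changes how strongly a single low-probability ancestor penalizes the whole path, which is the main design choice this sketch leaves open.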
Taxonomy
Topics: Machine Learning in Bioinformatics · Text and Document Classification Technologies · Gene expression and cancer classification
