Feature Ranking for Semi-supervised Learning

Matej Petković; Sašo Džeroski; Dragi Kocev

arXiv:2008.03937 · cs.LG · August 11, 2020 · 5 cites

Open Access

TL;DR

This paper introduces a semi-supervised feature ranking method applicable to classification, regression, and structured output prediction tasks, and reports that the semi-supervised rankings outperform their supervised counterparts on most benchmark datasets.

Contribution

The paper is the first to address feature ranking for semi-supervised structured output prediction. It proposes two new approaches: one based on tree ensembles and one based on Relief algorithms.
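
For orientation, the sketch below shows the classic supervised Relief idea for binary classification, the family of algorithms the contribution names. It is not the paper's semi-supervised method: the function name `relief`, the toy data, and the choice to visit every example once instead of sampling are all assumptions made for illustration. A feature gains weight when it separates an example from its nearest neighbour of the other class (nearest miss) and loses weight when it differs from the nearest neighbour of the same class (nearest hit).

```python
def relief(X, y):
    """Classic supervised Relief for binary labels (illustrative sketch).

    Returns one weight per feature; larger means more relevant.
    """
    n, d = len(X), len(X[0])

    # Per-feature value ranges, used to scale differences into [0, 1].
    ranges = []
    for j in range(d):
        column = [row[j] for row in X]
        ranges.append((max(column) - min(column)) or 1.0)

    def diff(j, a, b):
        return abs(a[j] - b[j]) / ranges[j]

    def dist(a, b):
        return sum(diff(j, a, b) for j in range(d))

    weights = [0.0] * d
    for i in range(n):
        hits = [k for k in range(n) if k != i and y[k] == y[i]]
        misses = [k for k in range(n) if y[k] != y[i]]
        if not hits or not misses:
            continue  # no neighbour of one kind, so nothing to compare
        hit = min(hits, key=lambda k: dist(X[i], X[k]))
        miss = min(misses, key=lambda k: dist(X[i], X[k]))
        for j in range(d):
            # Reward separation from the miss, penalise distance to the hit.
            weights[j] += (diff(j, X[i], X[miss]) - diff(j, X[i], X[hit])) / n
    return weights


# Hypothetical toy data: feature 0 determines the class, feature 1 is noise.
X = [[0.0, 0.3], [0.1, 0.9], [0.2, 0.1], [0.8, 0.8], [0.9, 0.2], [1.0, 0.5]]
y = [0, 0, 0, 1, 1, 1]

weights = relief(X, y)
# Feature indices ordered from most to least relevant.
ranking = sorted(range(len(weights)), key=lambda j: -weights[j])
```

On this toy data the informative feature 0 is ranked ahead of the noise feature 1. A semi-supervised variant would additionally have to decide how unlabeled examples contribute to the neighbour search and the weight updates; the sketch leaves that out.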

Findings

01

Random Forests excel in classification tasks

02

Extra-PCTs perform best in regression tasks

03

Semi-supervised rankings outperform supervised ones in most datasets

Abstract

The data made available for analysis are becoming increasingly complex along several directions: high dimensionality, the number of examples, and the number of labels per example. This poses a variety of challenges for existing machine learning methods, which must cope with datasets that contain a large number of examples described in a high-dimensional space, where not all examples have labels. For example, when investigating the toxicity of chemical compounds, many compounds are available and can be described with information-rich, high-dimensional representations, but toxicity information is not available for all of them. To address these challenges, we propose semi-supervised learning of feature rankings. The feature rankings are learned in the context of classification and regression, as well as in the context of structured output prediction (multi-label classification,…

Taxonomy

Topics: Computational Drug Discovery Methods · Text and Document Classification Technologies · Machine Learning and Data Classification