A probabilistic methodology for multilabel classification

Alfonso E. Romero; Luis M. de Campos

arXiv:1201.4777 · cs.AI · March 1, 2013

Open Access

TL;DR

This paper introduces a probabilistic methodology that enhances multilabel classification by modeling label co-occurrences, outperforming traditional independent binary classifiers across multiple datasets.

Contribution

It proposes a generic approach to improve multilabel classification by learning label relationships, addressing limitations of the binary method.

Findings

1. Noticeable improvements in classification accuracy across datasets
2. Effective integration of label co-occurrence information
3. Enhanced performance with multiple probabilistic classifiers

Abstract

Multilabel classification is a relatively recent subfield of machine learning. Unlike the classical approach, where instances are labeled with only one category, multilabel classification assigns an arbitrary number of categories to an instance. Because of the problem's complexity (the solution is one among an exponential number of alternatives), a very common solution, the binary method, is frequently used: it learns a binary classifier for every category and combines them all afterwards. The independence assumption behind this solution is not realistic. In this work we give examples where the decisions for the labels are not taken independently, so a supervised approach should learn the existing relationships among categories to classify better. Therefore, we show here a generic methodology that can improve the results obtained by a set of independent…
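
To illustrate why the independence assumption of the binary method can fail, the following minimal Python sketch compares the observed joint frequency of each label pair with the product of the two marginal frequencies, which is what independent per-label decisions would imply. This is not the paper's methodology; the function name `cooccurrence_stats` and the toy label sets are hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_stats(label_sets):
    """For every pair of labels, return (observed joint frequency,
    product of marginal frequencies). A large gap between the two
    suggests the labels are not independent."""
    n = len(label_sets)
    # How often each label appears on its own.
    single = Counter(label for s in label_sets for label in set(s))
    # How often each (sorted) pair of labels appears together.
    pair = Counter(
        p for s in label_sets for p in combinations(sorted(set(s)), 2)
    )
    stats = {}
    for a, b in combinations(sorted(single), 2):
        joint = pair[(a, b)] / n
        independent = (single[a] / n) * (single[b] / n)
        stats[(a, b)] = (joint, independent)
    return stats


# Hypothetical toy data: each set holds the labels of one instance.
toy_label_sets = [
    {"sports", "football"},
    {"sports", "football"},
    {"sports"},
    {"politics"},
    {"politics", "economy"},
    {"economy"},
]

stats = cooccurrence_stats(toy_label_sets)
for (a, b), (joint, independent) in stats.items():
    print(f"{a} & {b}: joint={joint:.3f}, if independent={independent:.3f}")
```

In this toy data, "football" and "sports" co-occur more often than independence would predict, while "politics" and "sports" never co-occur. A set of independent binary classifiers has no access to either fact, which is the kind of relationship the abstract argues should be learned.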

Taxonomy

Topics: Text and Document Classification Technologies · Spam and Phishing Detection · Machine Learning in Bioinformatics