MN-DS: A Multilabeled News Dataset for News Articles Hierarchical Classification
Alina Petukhova, Nuno Fachada

TL;DR
This paper introduces MN-DS, a hierarchically labeled dataset of 10,917 news articles, designed to support machine learning research in news classification and topic prediction.
Contribution
The paper provides a new, manually labeled hierarchical news dataset with a detailed category taxonomy for improved news article classification.
Findings
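The dataset contains 10,917 articles, manually labeled under a hierarchical taxonomy of 17 first-level and 109 second-level categories.
It can be used to train machine learning models that classify news articles by topic.
It supports research in news structuring and in predicting future events from published news.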
Abstract
This article presents a dataset of 10,917 news articles with hierarchical news categories collected between 1 January 2019 and 31 December 2019. We manually labeled the articles based on a hierarchical taxonomy with 17 first-level and 109 second-level categories. This dataset can be used to train machine learning models for automatically classifying news articles by topic. This dataset can be helpful for researchers working on news structuring, classification, and predicting future events based on released news.
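The two-level labeling described in the abstract can be illustrated with a minimal sketch. It assumes each article carries one first-level and one second-level category; the record layout, the category names, and the helper functions below are hypothetical and are not taken from the dataset or the paper. The sketch builds a parent-to-children taxonomy from labeled records and scores predictions at the first level and over the full two-level path.

```python
from collections import defaultdict

# Hypothetical toy records: (text, level-1 category, level-2 category).
# Category names are illustrative placeholders, not the dataset's taxonomy.
records = [
    ("Central bank raises interest rates", "economy", "monetary policy"),
    ("Stock markets rally after earnings", "economy", "markets"),
    ("Team wins championship final", "sport", "competition"),
    ("Striker signs new contract", "sport", "transfers"),
]


def build_taxonomy(rows):
    """Map each level-1 category to the set of level-2 categories seen under it."""
    taxonomy = defaultdict(set)
    for _, level1, level2 in rows:
        taxonomy[level1].add(level2)
    return dict(taxonomy)


def is_consistent(taxonomy, level1, level2):
    """Check that a level-2 label is a child of the given level-1 label."""
    return level2 in taxonomy.get(level1, set())


def hierarchical_accuracy(true_pairs, pred_pairs):
    """Return (level-1 accuracy, full-path accuracy) for paired label tuples."""
    n = len(true_pairs)
    if n == 0:
        return 0.0, 0.0
    level1_hits = sum(t[0] == p[0] for t, p in zip(true_pairs, pred_pairs))
    path_hits = sum(t == p for t, p in zip(true_pairs, pred_pairs))
    return level1_hits / n, path_hits / n


taxonomy = build_taxonomy(records)
truth = [(level1, level2) for _, level1, level2 in records]

# Hypothetical predictions from some classifier (not a real model output).
preds = [
    ("economy", "monetary policy"),  # both levels correct
    ("economy", "monetary policy"),  # level 1 correct, level 2 wrong
    ("sport", "competition"),        # both levels correct
    ("economy", "markets"),          # both levels wrong
]

level1_acc, path_acc = hierarchical_accuracy(truth, preds)
print(level1_acc, path_acc)
```

Reporting the two scores separately shows whether a model's errors stay within the correct top-level category or cross category boundaries, a distinction that a flat accuracy figure would hide.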
Taxonomy
Topics: Topic Modeling · Web Data Mining and Analysis · Text and Document Classification Technologies
