# HSFN: Hierarchical Selection for Fake News Detection building Heterogeneous Ensemble

**Authors:** Sara B. Coutinho, Rafael M.O. Cruz, Francimaria R. S. Nascimento, George D. C. Cavalcanti

arXiv: 2508.21482 · 2025-09-01

## TL;DR

This paper introduces HSFN, a hierarchical classifier selection method for fake news detection that builds heterogeneous ensembles by prioritizing diversity while also accounting for performance. It achieves the highest accuracy on two of six datasets.

## Contribution

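The paper presents a novel hierarchical selection approach: classifiers are clustered by pairwise diversity, one pool is selected per level of the hierarchy, and the most diverse pool is used to build the ensemble, with a performance metric included in the selection.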

## Key findings

- Achieves highest accuracy on two of six datasets.
- Prioritizes diversity in classifier selection while incorporating a per-classifier performance metric.
- Evaluated with 40 heterogeneous classifiers against the Elbow heuristic and state-of-the-art baselines.

## Abstract

Psychological biases, such as confirmation bias, make individuals particularly vulnerable to believing and spreading fake news on social media, leading to significant consequences in domains such as public health and politics. Machine learning-based fact-checking systems have been widely studied to mitigate this problem. Among them, ensemble methods are particularly effective in combining multiple classifiers to improve robustness. However, their performance heavily depends on the diversity of the constituent classifiers; selecting genuinely diverse models remains a key challenge, especially when models tend to learn redundant patterns. In this work, we propose a novel automatic classifier selection approach that prioritizes diversity while also accounting for performance. The method first computes pairwise diversity between classifiers and applies hierarchical clustering to organize them into groups at different levels of granularity. A HierarchySelect procedure then explores these hierarchical levels to select one pool of classifiers per level, each representing a distinct intra-pool diversity. From these, the most diverse pool is identified and selected for ensemble construction. The selection process incorporates an evaluation metric reflecting each classifier's performance to ensure the ensemble also generalises well. We conduct experiments with 40 heterogeneous classifiers across six datasets from different application domains and with varying numbers of classes. Our method is compared against the Elbow heuristic and state-of-the-art baselines. Results show that our approach achieves the highest accuracy on two of six datasets. The implementation details are available on the project's repository: https://github.com/SaraBCoutinho/HSFN
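The pipeline described in the abstract (pairwise diversity, hierarchical clustering, one pool per level, then the most diverse pool) can be illustrated with a minimal sketch. This is not the authors' implementation; the abstract does not specify the details, so several choices here are assumptions made only for illustration: the disagreement rate as the diversity measure, average linkage for clustering, validation accuracy as the performance metric, and picking the most accurate classifier of each cluster as its representative. All function names (`disagreement`, `cluster_levels`, `select_pool`) and the toy predictions are hypothetical.

```python
from itertools import combinations


def disagreement(a, b):
    """Assumed diversity measure: fraction of samples where two classifiers differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def accuracy(pred, y_true):
    """Assumed performance metric: plain accuracy on a validation set."""
    return sum(p == t for p, t in zip(pred, y_true)) / len(y_true)


def cluster_levels(dist, n):
    """Average-linkage agglomerative clustering; yields the partition at each level."""
    clusters = [[i] for i in range(n)]
    yield [list(c) for c in clusters]
    while len(clusters) > 1:
        def linkage(pair):
            a, b = pair
            return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

        # Merge the two clusters whose members disagree the least (most redundant).
        a, b = min(combinations(clusters, 2), key=linkage)
        clusters = [c for c in clusters if c is not a and c is not b] + [a + b]
        yield [list(c) for c in clusters]


def select_pool(preds, y_true):
    """Build one pool per hierarchy level and return the most diverse pool."""
    n = len(preds)
    dist = [[disagreement(preds[i], preds[j]) for j in range(n)] for i in range(n)]
    acc = [accuracy(p, y_true) for p in preds]
    best_pool, best_div = None, -1.0
    for partition in cluster_levels(dist, n):
        if len(partition) < 2:
            continue  # a single-member pool has no pairwise diversity
        # Assumption: each cluster is represented by its most accurate member.
        pool = sorted(max(c, key=lambda i: acc[i]) for c in partition)
        pairs = list(combinations(pool, 2))
        div = sum(dist[i][j] for i, j in pairs) / len(pairs)
        if div > best_div:
            best_pool, best_div = pool, div
    return best_pool, best_div


# Toy validation predictions from four hypothetical classifiers.
y_true = [0, 1, 0, 1, 0, 1]
preds = [
    [0, 1, 0, 1, 0, 0],  # classifier 0
    [0, 1, 0, 1, 0, 0],  # classifier 1, redundant with classifier 0
    [1, 1, 0, 1, 0, 1],  # classifier 2
    [0, 1, 1, 0, 0, 1],  # classifier 3
]
pool, div = select_pool(preds, y_true)
print("selected pool:", pool, "mean pairwise disagreement:", div)
```

In this sketch the redundant classifiers 0 and 1 are merged first, so at coarser levels only one of them can enter a pool. A real setup would combine the selected pool with a fusion rule such as majority voting, which is outside the scope of this illustration.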

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/2508.21482/full.md

## Figures

2 figures with captions in the complete paper: https://tomesphere.com/paper/2508.21482/full.md

## References

36 references — full list in the complete paper: https://tomesphere.com/paper/2508.21482/full.md

---
Source: https://tomesphere.com/paper/2508.21482