Rethinking Graph-Based Document Classification: Learning Data-Driven Structures Beyond Heuristic Approaches
Margarita Bugueño, Gerard de Melo

TL;DR
This paper introduces a data-driven method for constructing graph structures in document classification, replacing heuristic approaches with learned dependencies, leading to improved accuracy and robustness.
Contribution
It proposes a novel approach to learn graph structures using self-attention, reducing reliance on heuristics and domain-specific rules in document classification.
Findings
Learned graphs outperform heuristic-based graphs in accuracy and F1 score.
Statistical filtering enhances classification robustness.
The approach generalizes well across multiple datasets.
Abstract
In document classification, graph-based models effectively capture document structure, overcoming sequence length limitations and enhancing contextual understanding. However, most existing graph document representations rely on heuristics, domain-specific rules, or expert knowledge. Unlike previous approaches, we propose a method to learn data-driven graph structures, eliminating the need for manual design and reducing domain dependence. Our approach constructs homogeneous weighted graphs with sentences as nodes, while edges are learned via a self-attention model that identifies dependencies between sentence pairs. A statistical filtering strategy aims to retain only strongly correlated sentences, improving graph quality while reducing the graph size. Experiments on three document classification datasets demonstrate that learned graphs consistently outperform heuristic-based graphs,…
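The graph construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the paper learns edges with a trained self-attention model, whereas this sketch uses plain scaled dot-product attention over fixed sentence embeddings with no learned projections. The filtering rule (keep edges whose weight exceeds the mean plus a multiple of the standard deviation) is an assumption, since the summary does not specify the statistical test. The names `attention_weights`, `filter_edges`, and `num_std` are hypothetical.

```python
import math
from statistics import mean, pstdev


def attention_weights(embeddings):
    """Row-wise softmax of scaled dot products between sentence embeddings.

    Assumption: raw embeddings act as both queries and keys; a trained
    model would apply learned projections first.
    """
    dim = len(embeddings[0])
    weights = []
    for query in embeddings:
        scores = [
            sum(a * b for a, b in zip(query, key)) / math.sqrt(dim)
            for key in embeddings
        ]
        top = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights


def filter_edges(weights, num_std=1.0):
    """Keep sentence pairs whose weight exceeds mean + num_std * std.

    Assumption: this threshold is a stand-in for the paper's statistical
    filtering strategy. Self-loops are ignored.
    """
    n = len(weights)
    off_diag = [weights[i][j] for i in range(n) for j in range(n) if i != j]
    if not off_diag:
        return {}
    threshold = mean(off_diag) + num_std * pstdev(off_diag)
    return {
        (i, j): weights[i][j]
        for i in range(n)
        for j in range(n)
        if i != j and weights[i][j] > threshold
    }


# Toy document: three sentences with 2-dimensional embeddings.
sentence_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
graph_edges = filter_edges(attention_weights(sentence_embeddings), num_std=0.0)
```

The resulting dictionary maps directed sentence pairs to edge weights, which is one simple way to represent a homogeneous weighted graph with sentences as nodes. Raising `num_std` prunes more edges and yields a smaller graph.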
Peer Reviews
Decision: ICLR 2026 Conference Withdrawn Submission
1. This paper introduces a self-attention-based approach that eliminates the dependency on heuristics and domain expertise. 2. Experimental results have verified the effectiveness of the proposed model.
1. The novelty is limited, as the work only applies graph attention neural networks to document graph tasks.
- Simplicity and generality. - Empirical evidence.
- Limited task scope. - Weak theoretical justification.
- The proposed graph inference method is indeed more adaptable to different datasets than the different heuristic graph constructions that are listed by the authors. - The overview Figure 2 provides a good understanding of the approach.
- Evaluating on only three datasets is not a lot and the transformer baselines you use change for each dataset. I think the empirical evidence in favour of your method should be extended for the method to really be of proven practical relevance. - I am not convinced that the methodological contribution or the empirical work offer sufficient novelty to warrant publication at the ICLR conference. - You say that heterogeneous graphs are "not comparable" to your homogenous construction and are t
1. This paper points out some limitations of existing work on text classification, especially graph-based frameworks. 2. The proposed framework achieves better performance on document classification tasks on the selected benchmark datasets.
- Lack of Novelty: The proposed framework applies self-attention to model correlations between sentences within a document, which is a well-established approach. Moreover, the handling of repeated sentences ignores contextual information, and the method treats sentence order in a bag-of-words manner without modeling reading sequence. - Limited Evaluation: The experimental validation is insufficient. The framework is only tested on limited text classification settings. Broader evaluation is expe
Taxonomy
Topics: Advanced Graph Neural Networks · Text and Document Classification Technologies · Topic Modeling
