On the Use of Unrealistic Predictions in Hundreds of Papers Evaluating Graph Representations
Li-Chung Lin, Cheng-Hung Liu, Chih-Ming Chen, Kai-Chin Hsu, I-Feng Wu, Ming-Feng Tsai, Chih-Jen Lin

TL;DR
This paper highlights the widespread use of ground-truth information at prediction time when evaluating graph representations, shows how this inflates performance estimates, and proposes practical alternatives for future research.
Contribution
It identifies the problematic assumption of known label counts in node classification, analyzes its causes, and offers simple, realistic evaluation settings for future studies.
Findings
Unrealistic ground truth assumptions inflate performance metrics.
Most existing evaluations rely on impractical label information.
Proposed evaluation settings avoid using unknown label data.
Abstract
Prediction using the ground truth sounds like an oxymoron in machine learning. However, such an unrealistic setting has been used in hundreds, if not thousands, of papers in the area of finding graph representations. To evaluate the multi-label problem of node classification using the obtained representations, many works assume in the prediction stage that the number of labels of each test instance is known. In practice such ground-truth information is rarely available, but we point out that this inappropriate setting is now ubiquitous in this research area. We investigate in detail why the situation occurs. Our analysis indicates that with unrealistic information, the performance is likely over-estimated. To see why suitable predictions were not used, we identify difficulties in applying some multi-label techniques. For use in future studies, we propose simple and effective…
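The setting the abstract criticizes can be illustrated with a small sketch. In the code below, the function names, the toy scores, and the fixed 0.5 threshold are all assumptions made for illustration; the threshold rule is a generic stand-in for a prediction rule that uses no test labels, not the setting the paper proposes. The sketch only shows how reading each test node's true label count at prediction time can raise the measured micro-F1.

```python
import numpy as np

def predict_with_known_counts(scores, true_labels):
    """Unrealistic setting: for each test node, predict its top-k labels,
    where k is that node's TRUE number of labels."""
    pred = np.zeros_like(true_labels)
    for i, row in enumerate(scores):
        k = int(true_labels[i].sum())  # ground truth leaks into prediction here
        if k > 0:
            pred[i, np.argsort(-row)[:k]] = 1
    return pred

def predict_with_threshold(scores, threshold=0.5):
    """Illustrative realistic rule (an assumption, not the paper's method):
    predict every label whose score reaches a fixed threshold."""
    return (scores >= threshold).astype(int)

def micro_f1(y_true, y_pred):
    """Micro-averaged F1 over all (node, label) pairs."""
    tp = int(((y_true == 1) & (y_pred == 1)).sum())
    fp = int(((y_true == 0) & (y_pred == 1)).sum())
    fn = int(((y_true == 1) & (y_pred == 0)).sum())
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

# Hypothetical decision scores for 3 test nodes and 3 labels (toy data).
scores = np.array([[0.90, 0.40, 0.10],
                   [0.45, 0.40, 0.20],
                   [0.60, 0.55, 0.30]])
y_true = np.array([[1, 1, 0],
                   [1, 0, 0],
                   [1, 0, 0]])

f1_known = micro_f1(y_true, predict_with_known_counts(scores, y_true))
f1_threshold = micro_f1(y_true, predict_with_threshold(scores))
print(f"known label counts: {f1_known:.3f}, fixed threshold: {f1_threshold:.3f}")
```

On this toy input the same scores receive a higher micro-F1 under the known-count rule than under the threshold rule, because the known-count rule never predicts too many or too few labels for a node.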
Taxonomy
Topics: Text and Document Classification Technologies · Advanced Graph Neural Networks · Advanced Text Analysis Techniques
