A Text-based Approach For Link Prediction on Wikipedia Articles
Anh Hoang Tran, Tam Minh Nguyen, Son T. Luu

TL;DR
This paper explores a text-based machine learning approach that uses POS tags to predict links between Wikipedia articles, reaching an F1 score of 0.99999 and 7th place in the DSAA 2023 Challenge.
Contribution
It introduces a novel application of POS tag features with traditional ML models for Wikipedia link prediction, demonstrating competitive performance.
Findings
F1 score of 0.99999 achieved
Ranked 7th in DSAA 2023 Challenge
Source code publicly available
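For readers unfamiliar with the metric in the findings above, F1 is the harmonic mean of precision and recall. The sketch below computes it for binary link labels. It assumes the positive class (1 = "link exists") is the one being scored; the exact averaging variant used by the challenge is not stated in this summary, and the labels are toy values rather than challenge data.

```python
def f1_score(y_true, y_pred):
    """Binary F1 for the positive class (1 = the two nodes are linked)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    if tp == 0:
        return 0.0  # no correct positive predictions, so F1 is zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Toy labels for illustration only (not challenge data).
y_true = [1, 1, 0, 0, 1]
y_pred = [1, 0, 0, 1, 1]
score = f1_score(y_true, y_pred)
```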
Abstract
This paper presents our work in the DSAA 2023 Challenge on Link Prediction for Wikipedia Articles. We extract part-of-speech (POS) tag features from the text and use them to train traditional machine learning classifiers that predict whether two nodes are linked. We then evaluate these features with various machine learning models. We obtained an F1 score of 0.99999 and placed 7th in the competition. Our source code is publicly available at: https://github.com/Tam1032/DSAA2023-Challenge-Link-prediction-DS-UIT_SAT
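As a rough illustration of the idea in the abstract, the sketch below turns the POS tag sequences of two nodes into one feature row that a traditional classifier could consume. It is not the authors' implementation: the tag inventory `TAGSET`, the helper names `tag_counts` and `pair_features`, and the choice of normalized tag frequencies are all assumptions made for this example, and the tag sequences are hand-written stand-ins for the output of a real POS tagger.

```python
from collections import Counter

# Hypothetical tag inventory (a small subset of Universal POS tags plus a
# catch-all); the tag set actually used in the paper is not given here.
TAGSET = ["NOUN", "VERB", "ADJ", "ADV", "PROPN", "OTHER"]

def tag_counts(tags):
    """Return a fixed-length vector of normalized POS tag frequencies."""
    counts = Counter(t if t in TAGSET else "OTHER" for t in tags)
    total = sum(counts.values()) or 1  # avoid division by zero on empty text
    return [counts[t] / total for t in TAGSET]

def pair_features(tags_u, tags_v):
    """Concatenate the tag-frequency vectors of two nodes into one row."""
    return tag_counts(tags_u) + tag_counts(tags_v)

# Hand-written tag sequences standing in for two tagged article texts.
tags_a = ["PROPN", "VERB", "NOUN", "NOUN"]
tags_b = ["ADJ", "NOUN", "X"]  # "X" is outside TAGSET and maps to "OTHER"
row = pair_features(tags_a, tags_b)
```

Each candidate node pair yields one such row, and a binary label (linked or not) would be attached before training any standard classifier on the resulting table.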
Taxonomy
Topics: Wikis in Education and Collaboration
