Evaluating the performance-deviation of itemKNN in RecBole and LensKit
Michael Schmidt, Jannik Nitschke, Tim Prinz

TL;DR
This paper compares the ItemKNN implementations in the RecBole and LensKit libraries across four datasets, identifies the differences in similarity calculation that drive the gap in accuracy metrics, and shows how targeted adjustments bring their results into line.
Contribution
It provides a detailed analysis of the performance deviations between the ItemKNN implementations in RecBole and LensKit, and proposes modifications that make the two directly comparable.
Findings
RecBole outperforms LensKit on ML-100K in nDCG and precision, but scores lower on recall.
Aligning nDCG calculations reduces performance differences.
Similarity matrix calculation differences are the main cause of deviations.
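To make the nDCG finding concrete, here is a minimal sketch of a binary-relevance nDCG@k for a single user. It is an illustration only, not the code of either RecBole or LensKit: the function name `ndcg_at_k`, the binary gain, and the log2 discount are assumptions chosen because they are the most common convention. Libraries can differ in details such as how the ideal ranking is built or how gains are weighted, which is the kind of discrepancy that has to be aligned before scores are compared.

```python
import numpy as np


def ndcg_at_k(ranked_items, relevant_items, k=10):
    """Binary-relevance nDCG@k for one user (hypothetical helper, not a library API)."""
    relevant = set(relevant_items)
    top_k = list(ranked_items)[:k]

    # Each hit contributes a gain of 1, discounted by log2(rank + 1), ranks starting at 1.
    discounts = 1.0 / np.log2(np.arange(2, len(top_k) + 2))
    hits = np.array([1.0 if item in relevant else 0.0 for item in top_k])
    dcg = float(np.sum(hits * discounts))

    # Ideal DCG: as many relevant items as fit in the top k, all ranked first.
    ideal_hits = min(len(relevant), k)
    if ideal_hits == 0:
        return 0.0
    idcg = float(np.sum(1.0 / np.log2(np.arange(2, ideal_hits + 2))))
    return dcg / idcg
```

Averaging this value over all test users gives a system-level nDCG@k under the stated assumptions.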
Abstract
This study examines the performance of item-based k-Nearest Neighbors (ItemKNN) algorithms in the RecBole and LensKit recommender system libraries. Using four data sets (Anime, Modcloth, ML-100K, and ML-1M), we assess each library's efficiency, accuracy, and scalability, focusing primarily on normalized discounted cumulative gain (nDCG). Our results show that RecBole outperforms LensKit on two of three metrics on the ML-100K data set: it achieved an 18% higher nDCG, 14% higher precision, and 35% lower recall. To ensure a fair comparison, we adjusted LensKit's nDCG calculation to match RecBole's method. This alignment made the performance more comparable, with LensKit achieving an nDCG of 0.2540 and RecBole 0.2674. Differences in similarity matrix calculations were identified as the main cause of performance deviations. After modifying LensKit to retain only the top K similar items, both…
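The top-K retention step mentioned in the abstract can be sketched as follows. This is a simplified illustration, not the actual modification made to LensKit: it assumes a small dense user-by-item rating matrix and cosine similarity, and the helper names `cosine_item_similarity` and `keep_top_k` are hypothetical. The idea is that each item keeps only its K most similar neighbors and every other entry of the similarity matrix is set to zero.

```python
import numpy as np


def cosine_item_similarity(ratings):
    """Item-item cosine similarity from a dense user x item matrix (assumed setup)."""
    norms = np.linalg.norm(ratings, axis=0)
    norms[norms == 0] = 1.0  # avoid division by zero for items with no ratings
    normalized = ratings / norms
    sim = normalized.T @ normalized
    np.fill_diagonal(sim, 0.0)  # an item is not counted as its own neighbor
    return sim


def keep_top_k(sim, k):
    """Keep the k largest similarities in each row and zero out the rest."""
    pruned = np.zeros_like(sim)
    if k <= 0:
        return pruned
    k = min(k, sim.shape[1])
    # argpartition on the negated matrix puts the k largest entries of each row first.
    top_idx = np.argpartition(-sim, k - 1, axis=1)[:, :k]
    rows = np.arange(sim.shape[0])[:, None]
    pruned[rows, top_idx] = sim[rows, top_idx]
    return pruned
```

Whether pruning happens when the similarity matrix is built or only when neighbors are selected at prediction time changes which neighbors contribute to a score, so two implementations with the same nominal K can still rank items differently.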
Taxonomy
Topics: Data Mining Algorithms and Applications · Advanced Graph Neural Networks · Text and Document Classification Technologies
Methods: Sparse Evolutionary Training
