Deduplicating and Ranking Solution Programs for Suggesting Reference Solutions
Atsushi Shirafuji, Yutaka Watanobe

TL;DR
This paper proposes a deduplication and ranking method for solution programs in online judges to reduce redundancy and help learners access diverse, representative solutions efficiently.
Contribution
It introduces a novel approach to remove near-duplicate solutions and rank unique programs by popularity, significantly reducing the number of solutions users need to review.
Findings
Deduplication reduced the number of programs by 60.20%.
Top-10 programs cover nearly 30% of solutions.
Users only need to refer to about 40% of programs on average.
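To make the first two figures concrete, the sketch below computes a reduction rate and a top-k coverage from a list of duplicate-group sizes. The function names, the toy data, and the exact metric definitions are assumptions made for illustration; the summary does not state how the paper defines these quantities.

```python
def reduction_rate(total_programs: int, unique_programs: int) -> float:
    """Fraction of programs removed by deduplication (assumed definition)."""
    return 1 - unique_programs / total_programs


def top_k_coverage(group_sizes: list[int], k: int) -> float:
    """Share of all submissions that fall into the k largest duplicate groups.

    group_sizes[i] is the number of submissions merged into unique program i.
    This reading of "cover" is an assumption, not the paper's stated definition.
    """
    if not group_sizes:
        return 0.0
    ranked = sorted(group_sizes, reverse=True)
    return sum(ranked[:k]) / sum(ranked)


# Toy data (hypothetical): 10 submissions collapse into 4 unique programs.
group_sizes = [5, 3, 1, 1]
reduction = reduction_rate(sum(group_sizes), len(group_sizes))
coverage = top_k_coverage(group_sizes, k=2)
```

In this toy case, 6 of 10 programs are removed, and the two largest groups account for 8 of 10 submissions.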
Abstract
Referring to solution programs written by other users helps learners in programming education. However, current online judge systems simply list every submitted solution program for reference, sorted by submission date and time, execution time, or user rating, without considering how useful each program is as a reference. In addition, users struggle to find a variety of solution approaches because there are too many duplicated and near-duplicated programs. To motivate learners to consult various solutions and learn better approaches, this paper proposes a method to deduplicate and rank common solution programs for each programming problem. Based on the observation that a program with many duplicates adopts a more common approach and can serve as a general reference, we remove the near-duplicated solution programs and rank the…
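The general idea of grouping near-duplicates and ranking unique programs by how often they recur can be sketched as follows. This is a minimal illustration, not the authors' method: the abstract above is truncated, so the normalization rule (dropping `#` comments and whitespace) and the names `normalize` and `deduplicate_and_rank` are hypothetical stand-ins for whatever near-duplicate criterion the paper actually uses.

```python
import re
from collections import Counter


def normalize(source: str) -> str:
    """Hypothetical near-duplicate key: strip '#' comments and all whitespace."""
    kept = []
    for line in source.splitlines():
        line = line.split("#", 1)[0]  # naive: also cuts '#' inside string literals
        line = re.sub(r"\s+", "", line)
        if line:
            kept.append(line)
    return "\n".join(kept)


def deduplicate_and_rank(programs: list[str]) -> list[tuple[str, int]]:
    """Return (representative program, duplicate count), most duplicated first."""
    counts: Counter[str] = Counter()
    representative: dict[str, str] = {}
    for src in programs:
        key = normalize(src)
        counts[key] += 1
        # Keep the first submission seen as the group's representative.
        representative.setdefault(key, src)
    return [(representative[key], n) for key, n in counts.most_common()]


# Toy submissions (hypothetical): the first two differ only in spacing and a comment.
programs = ["print(a + b)", "print(a+b)  # sum", "print(sum([a, b]))"]
ranked = deduplicate_and_rank(programs)
```

Here the two spacing variants collapse into one group with a count of 2, which ranks ahead of the single `sum`-based solution. A real system would likely need a token-level or syntax-aware comparison rather than this text-level key.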
Taxonomy
Topics: Software Engineering Research · Online Learning and Analytics · Software Engineering Techniques and Practices
