TL;DR
This paper introduces sparseKT, a framework that enhances the robustness and generalization of attention-based knowledge tracing models by selectively focusing on the most relevant student interactions through sparsification techniques.
Contribution
The paper proposes a novel sparsification approach for attention in knowledge tracing models, improving robustness and reducing overfitting on small datasets.
Findings
sparseKT improves the robustness and generalization of attention-based knowledge tracing models.
It achieves comparable accuracy to state-of-the-art models.
The approach effectively filters irrelevant interactions.
Abstract
Knowledge tracing (KT) is the problem of predicting students' future performance based on their historical interaction sequences. With its advanced capability of capturing contextual long-term dependencies, the attention mechanism has become one of the essential components in many deep learning based KT (DLKT) models. In spite of the impressive performance achieved by these attentional DLKT models, many of them are prone to overfitting, especially on small-scale educational datasets. Therefore, in this paper, we propose sparseKT, a simple yet effective framework to improve the robustness and generalization of attention-based DLKT approaches. Specifically, we incorporate a k-selection module to pick only the items with the highest attention scores. We propose two sparsification heuristics: (1) soft-thresholding sparse attention and (2) top-K sparse…
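The k-selection idea in the abstract can be illustrated with a minimal NumPy sketch of top-K sparse attention. This is not the authors' implementation: the function name `topk_sparse_attention`, the causal mask, the single-head layout, and the choice to renormalize the retained weights so they sum to one are all assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(x)
    return e / np.sum(e, axis=axis, keepdims=True)


def topk_sparse_attention(q, k, v, top_k):
    """Hypothetical single-head top-K sparse attention (illustrative only).

    q, k, v: arrays of shape (seq_len, d), one row per student interaction.
    Each position attends to itself and earlier positions (assumed causal mask),
    then keeps only its top_k largest attention weights.
    """
    seq_len, d = q.shape
    scores = q @ k.T / np.sqrt(d)

    # Assumption: block attention to future interactions.
    causal = np.tril(np.ones((seq_len, seq_len), dtype=bool))
    scores = np.where(causal, scores, -np.inf)
    weights = softmax(scores, axis=-1)

    # Per-row threshold: the top_k-th largest weight in that row.
    kk = min(top_k, seq_len)
    kth = np.sort(weights, axis=-1)[:, -kk][:, None]

    # Zero out everything below the threshold (ties at the threshold are kept).
    sparse = np.where(weights >= kth, weights, 0.0)

    # Assumption: renormalize the surviving weights so each row sums to 1.
    sparse = sparse / sparse.sum(axis=-1, keepdims=True)
    return sparse @ v, sparse


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    q, k, v = (rng.normal(size=(6, 4)) for _ in range(3))
    out, w = topk_sparse_attention(q, k, v, top_k=2)
    print(np.count_nonzero(w, axis=1))  # nonzero weights kept per position
```

Early positions have fewer than `top_k` past interactions available, so they simply keep all of them. The soft-thresholding variant named in the abstract is not sketched here because the excerpt is truncated before describing it.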
Taxonomy
Methods: Softmax · Attention Is All You Need
