Data Augmentation for Sparse Multidimensional Learning Performance Data Using Generative AI

Liang Zhang; Jionghao Lin; John Sabatini; Conrad Borchers; Daniel Weitekamp; Meng Cao; John Hollander; Xiangen Hu; Arthur C. Graesser

arXiv:2409.15631·cs.LG·January 7, 2025

Open Access · 1 Repo

TL;DR

This paper introduces a tensor-based data augmentation framework using GANs and GPT to address the high sparsity in learning performance data, improving predictive accuracy in adaptive learning systems.

Contribution

It presents a novel systematic approach combining tensor factorization and generative AI models to augment sparse learner data for better performance prediction.

Findings

01 Tensor factorization enhances knowledge tracing accuracy.

02 GAN-based data generation offers more stable and less biased results than GPT.

03 Augmentation improves performance prediction in adaptive learning systems.

Abstract

Learning performance data describe correct and incorrect answers or problem-solving attempts in adaptive learning, such as in intelligent tutoring systems (ITSs). Learning performance data tend to be highly sparse (80% to 90% missing observations) in most real-world applications due to adaptive item selection. This data sparsity presents challenges to using learner models to effectively predict future performance and explore new hypotheses about learning. This article proposes a systematic framework for augmenting learner data to address data sparsity in learning performance data. First, learning performance is represented as a three-dimensional tensor of learners' questions, answers, and attempts, capturing longitudinal knowledge states during learning. Second, a tensor factorization method is used to impute missing values in sparse tensors of collected learner data, thereby…
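
The first two steps named in the abstract (a three-dimensional tensor representation, then factorization-based imputation of missing cells) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the tensor axes are learners, questions, and attempts, it swaps in a generic rank-R CP decomposition fitted by masked gradient descent, and every function name, hyperparameter, and toy record below is hypothetical.

```python
import numpy as np


def build_tensor(records, n_learners, n_questions, n_attempts):
    """Place (learner, question, attempt, correct) records into a 3-D tensor.

    Cells with no observation stay NaN, which is how sparsity is encoded here.
    """
    tensor = np.full((n_learners, n_questions, n_attempts), np.nan)
    for learner, question, attempt, correct in records:
        tensor[learner, question, attempt] = correct
    return tensor


def cp_impute(tensor, rank=2, steps=2000, lr=0.05, reg=0.01, seed=0):
    """Fill NaN cells using a CP factorization fitted on observed cells only.

    Assumption: plain gradient descent on a masked squared error with L2
    regularization. The paper's actual factorization method may differ.
    """
    rng = np.random.default_rng(seed)
    mask = ~np.isnan(tensor)                  # True where a response was observed
    observed = np.where(mask, tensor, 0.0)    # NaNs replaced so arithmetic is safe
    n_i, n_j, n_k = tensor.shape

    # One factor matrix per tensor mode, initialized away from zero.
    A = rng.uniform(0.2, 0.8, size=(n_i, rank))
    B = rng.uniform(0.2, 0.8, size=(n_j, rank))
    C = rng.uniform(0.2, 0.8, size=(n_k, rank))

    for _ in range(steps):
        pred = np.einsum("ir,jr,kr->ijk", A, B, C)
        err = mask * (pred - observed)        # error counted on observed cells only
        grad_A = np.einsum("ijk,jr,kr->ir", err, B, C) + reg * A
        grad_B = np.einsum("ijk,ir,kr->jr", err, A, C) + reg * B
        grad_C = np.einsum("ijk,ir,jr->kr", err, A, B) + reg * C
        A -= lr * grad_A
        B -= lr * grad_B
        C -= lr * grad_C

    completed = np.clip(np.einsum("ir,jr,kr->ijk", A, B, C), 0.0, 1.0)
    # Keep observed responses untouched; only missing cells take model estimates.
    return np.where(mask, tensor, completed)


# Toy data (hypothetical): 3 learners, 3 questions, 2 attempts, 6 observed cells.
records = [(0, 0, 0, 1), (0, 0, 1, 1), (1, 0, 0, 0),
           (1, 1, 0, 1), (2, 1, 1, 0), (2, 2, 0, 1)]
sparse = build_tensor(records, n_learners=3, n_questions=3, n_attempts=2)
dense = cp_impute(sparse)
print("missing before:", int(np.isnan(sparse).sum()), "after:", int(np.isnan(dense).sum()))
```

In this sketch the imputed entries are clipped to [0, 1] so they can be read as estimated probabilities of a correct response. With only a handful of observed cells the estimates are not meaningful; the example is meant only to show the data layout and the masked-loss idea.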

Code & Models

Repositories

liangzhang2017/3dgai
Official

Taxonomy

Topics: Online Learning and Analytics · Intelligent Tutoring Systems and Adaptive Learning

Methods: Attention Is All You Need · Linear Layer · Cosine Annealing · Multi-Head Attention · Weight Decay · Linear Warmup With Cosine Annealing · Adam · Residual Connection · Byte Pair Encoding