GSSF: A Generative Sequence Similarity Function based on a Seq2Seq model for clustering online handwritten mathematical answers
Huy Quang Ung, Cuong Tuan Nguyen, Hung Tuan Nguyen, Masaki Nakagawa

TL;DR
This paper introduces GSSF, a generative sequence similarity function based on a Seq2Seq model, to improve clustering of online handwritten mathematical expressions for assisting in computer-aided marking.
Contribution
It proposes a novel similarity measure using a Seq2Seq recognizer and demonstrates its effectiveness in clustering handwritten math answers.
Findings
Achieved high clustering purity (~0.916) on mixed datasets.
Reduced marking cost effectively compared to previous methods.
Outperformed existing clustering approaches for online handwritten math expressions.
Abstract
Toward a computer-assisted marking for descriptive math questions,this paper presents clustering of online handwritten mathematical expressions (OnHMEs) to help human markers to mark them efficiently and reliably. We propose a generative sequence similarity function for computing a similarity score of two OnHMEs based on a sequence-to-sequence OnHME recognizer. Each OnHME is represented by a similarity-based representation (SbR) vector. The SbR matrix is inputted to the k-means algorithm for clustering OnHMEs. Experiments are conducted on an answer dataset (Dset_Mix) of 200 OnHMEs mixed of real patterns and synthesized patterns for each of 10 questions and a real online handwritten mathematical answer dataset of 122 student answers at most for each of 15 questions (NIER_CBT). The best clustering results achieved around 0.916 and 0.915 for purity, and around 0.556 and 0.702 for the…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsTopic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
