Explaining Differences in Classes of Discrete Sequences

Samaneh Saadat; Gita Sukthankar

arXiv:2011.03371·cs.HC·November 9, 2020

TL;DR

This paper introduces methods to interpret and explain differences between classes of discrete sequences, aiding understanding of human behavior and sequence classification models.

Contribution

The paper presents novel techniques for analyzing and interpreting differences between sequence classes, enhancing explainability of sequence classification models.

Findings

01 Silhouette score comparison of k-gram representations reveals class differences.

02 Distance matrix analysis characterizes key differences between sequence groups.

03 Applied methods successfully distinguished bot and non-bot GitHub team sequences.
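
One way to picture the distance matrix finding is the minimal sketch below. It is not the paper's procedure, which this summary does not spell out. It assumes Levenshtein edit distance as the sequence metric and compares mean within-class and between-class distances. The function names, the event codes (P, I, C) and the four toy sequences are all hypothetical, invented for illustration only.

```python
from itertools import combinations


def edit_distance(a, b):
    """Levenshtein distance between two sequences of hashable symbols."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def distance_matrix(seqs):
    """Symmetric matrix of pairwise edit distances."""
    n = len(seqs)
    d = [[0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d[i][j] = d[j][i] = edit_distance(seqs[i], seqs[j])
    return d


def mean_block_distances(d, labels):
    """Mean distance for each unordered pair of class labels."""
    sums, counts = {}, {}
    for i, j in combinations(range(len(labels)), 2):
        key = tuple(sorted((labels[i], labels[j])))
        sums[key] = sums.get(key, 0) + d[i][j]
        counts[key] = counts.get(key, 0) + 1
    return {key: sums[key] / counts[key] for key in sums}


# Invented toy data: one character per event, not taken from the paper.
seqs = ["PPPP", "PPPI", "ICIC", "ICCI"]
labels = ["bot", "bot", "human", "human"]

matrix = distance_matrix(seqs)
means = mean_block_distances(matrix, labels)
for pair, value in sorted(means.items()):
    print(pair, value)
```

If the between-class mean is clearly larger than both within-class means, the chosen metric separates the groups. Inspecting which pairs sit far apart then points to the sequences that drive the difference.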

Abstract

While there are many machine learning methods to classify and cluster sequences, they fail to explain which differences between groups of sequences make them distinguishable. Although in some cases having a black box model is sufficient, there is a need for increased explainability in research areas focused on human behaviors. For example, psychologists are less interested in having a model that predicts human behavior with high accuracy and more concerned with identifying differences between actions that lead to divergent human behavior. This paper presents techniques for understanding differences between classes of discrete sequences. The approaches introduced in this paper can be used to interpret black box machine learning models on sequences. The first approach compares k-gram representations of sequences using the silhouette score. The second method characterizes…
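
The first approach, comparing k-gram representations with the silhouette score, can be sketched roughly as follows. This is a hedged illustration under stated assumptions rather than the authors' implementation. It assumes raw k-gram count vectors, Euclidean distance and the standard mean silhouette coefficient, all written in plain Python. The helper names and the toy sequences are hypothetical.

```python
from collections import Counter
from math import sqrt


def kgram_counts(seq, k):
    """Count every contiguous run of k symbols in seq."""
    return Counter(tuple(seq[i:i + k]) for i in range(len(seq) - k + 1))


def vectorize(seqs, k):
    """Map each sequence to a count vector over a shared k-gram vocabulary."""
    counts = [kgram_counts(s, k) for s in seqs]
    vocab = sorted(set().union(*counts))
    return [[c[g] for g in vocab] for c in counts]


def euclidean(u, v):
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def silhouette_score(vectors, labels):
    """Mean silhouette coefficient; assumes at least two classes."""
    scores = []
    for i, (v, lab) in enumerate(zip(vectors, labels)):
        by_class = {}
        for j, (w, other) in enumerate(zip(vectors, labels)):
            if j != i:
                by_class.setdefault(other, []).append(euclidean(v, w))
        own = by_class.pop(lab, None)
        if not own:  # a class with a single member scores 0 by convention
            scores.append(0.0)
            continue
        a = sum(own) / len(own)  # mean distance to the point's own class
        b = min(sum(d) / len(d) for d in by_class.values())  # nearest other class
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


# Invented toy data: one character per event, not taken from the paper.
seqs = ["PPPPPP", "PPPPPI", "ICICIC", "CICICI"]
labels = ["bot", "bot", "human", "human"]

scores = {k: silhouette_score(vectorize(seqs, k), labels) for k in (1, 2, 3)}
for k, score in scores.items():
    print(k, round(score, 3))
```

Scores lie in [-1, 1]. Under these assumptions, a value of k with a higher score suggests that patterns of that length separate the classes more cleanly.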
