Variational Autoencoder with Embedded Student-$t$ Mixture Model for Authorship Attribution

Benedikt Boenninghoff; Steffen Zeiler; Robert M. Nickel; Dorothea Kolossa

arXiv:2005.13930·cs.LG·May 29, 2020

TL;DR

This paper introduces a novel variational autoencoder with a Student-$t$ mixture model for authorship attribution, improving robustness to outliers and heavy-tailed data distributions in text classification tasks.

Contribution

It extends the VAE framework by embedding a Student-$t$ mixture model, allowing better modeling of heavy-tailed distributions in authorship attribution.
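
To make the heavy-tail argument concrete, the following minimal sketch compares the log-density of a univariate Gaussian with that of a univariate Student-$t$ distribution at points increasingly far from the mean. It is an illustration only, not the paper's model: the function names, the one-dimensional setting, and the choice of $\nu = 3$ degrees of freedom are assumptions made here for the example.

```python
import math


def gaussian_logpdf(x, mu=0.0, sigma=1.0):
    """Log-density of a univariate Gaussian N(mu, sigma^2)."""
    z = (x - mu) / sigma
    return -0.5 * math.log(2.0 * math.pi) - math.log(sigma) - 0.5 * z * z


def student_t_logpdf(x, mu=0.0, sigma=1.0, nu=3.0):
    """Log-density of a univariate Student-t with location mu, scale sigma
    and nu degrees of freedom (nu = 3 is an illustrative choice)."""
    z = (x - mu) / sigma
    return (
        math.lgamma((nu + 1.0) / 2.0)
        - math.lgamma(nu / 2.0)
        - 0.5 * math.log(nu * math.pi)
        - math.log(sigma)
        - 0.5 * (nu + 1.0) * math.log1p(z * z / nu)
    )


if __name__ == "__main__":
    # The Gaussian log-density falls quadratically in the distance from the
    # mean, while the Student-t log-density falls only logarithmically.
    for x in (0.0, 2.0, 6.0):
        print(x, gaussian_logpdf(x), student_t_logpdf(x))
```

Because the Student-$t$ log-density decays much more slowly, a sample far from a component's center is penalized less severely than under a Gaussian, which is the usual intuition behind the robustness to outliers mentioned above. As $\nu$ grows, the Student-$t$ density approaches the Gaussian one.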

Findings

01 Superior performance on Amazon review dataset

02 Enhanced robustness to outliers

03 Effective modeling of heavy-tailed data

Abstract

Traditional computational authorship attribution describes a classification task in a closed-set scenario. Given a finite set of candidate authors and corresponding labeled texts, the objective is to determine which of the authors has written another set of anonymous or disputed texts. In this work, we propose a probabilistic autoencoding framework to deal with this supervised classification task. More precisely, we are extending a variational autoencoder (VAE) with embedded Gaussian mixture model to a Student-$t$ mixture model. Autoencoders have had tremendous success in learning latent representations. However, existing VAEs are currently still bound by limitations imposed by the assumed Gaussianity of the underlying probability distributions in the latent space. In this work, we are extending the Gaussian model for the VAE to a Student-$t$ model, which allows for an independent…
