Inference through innovation processes tested in the authorship attribution task

Giulio Tani Raffaelli; Margherita Lalli; Francesca Tria

arXiv:2306.05186 · stat.ME · July 8, 2024 · 1 citation

TL;DR

This paper introduces a novel approach for authorship attribution using urn models with triggering, leveraging their connection to Bayesian non-parametric inference to improve accuracy, efficiency, and flexibility in analyzing symbolic sequences.

Contribution

It presents a general method for measuring similarity between sequences based on urn models, relaxing exchangeability assumptions and enhancing inference in complex, non-stationary systems.

Findings

01 High accuracy in authorship attribution tasks

02 Significant computational efficiency gains

03 Ability to handle non-stationary, correlated data

Abstract

Urn models for innovation capture fundamental empirical laws shared by several real-world processes. The so-called urn model with triggering includes, as particular cases, the urn representation of the two-parameter Poisson-Dirichlet process and the Dirichlet process, seminal in Bayesian non-parametric inference. In this work, we leverage this connection to introduce a general approach for quantifying closeness between symbolic sequences and test it within the framework of the authorship attribution problem. The method demonstrates high accuracy when compared to other related methods in different scenarios, featuring a substantial gain in computational efficiency and theoretical transparency. Beyond the practical convenience, this work demonstrates how the recently established connection between urn models and non-parametric Bayesian inference can pave the way for designing more…
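To make the special case named in the abstract concrete, the following is a minimal sketch of the urn (sequential) representation of the two-parameter Poisson-Dirichlet process, with the Dirichlet process recovered at `alpha = 0`. It is not the authors' method or code: the more general urn model with triggering is not implemented here, and the function name `py_log_prob`, the default parameter values, and the choice to score only the new-versus-repeated structure of a sequence (ignoring any base distribution over new symbols) are assumptions made for illustration.

```python
from collections import Counter
import math


def py_log_prob(sequence, alpha=0.5, theta=1.0):
    """Log-probability of the new/repeat pattern of `sequence` under the
    two-parameter Poisson-Dirichlet urn scheme (hypothetical helper).

    After n draws containing k distinct symbols:
      P(new symbol)            = (theta + alpha * k) / (theta + n)
      P(symbol s seen c times) = (c - alpha) / (theta + n)
    """
    if not (0 <= alpha < 1) or theta <= -alpha:
        raise ValueError("require 0 <= alpha < 1 and theta > -alpha")

    counts = Counter()
    n = 0
    logp = 0.0
    for s in sequence:
        k = len(counts)
        if n == 0:
            p = 1.0  # the first draw is always a new symbol
        elif s in counts:
            p = (counts[s] - alpha) / (theta + n)
        else:
            p = (theta + alpha * k) / (theta + n)
        logp += math.log(p)
        counts[s] += 1
        n += 1
    return logp


if __name__ == "__main__":
    # Two orderings of the same symbols receive the same score,
    # which is the exchangeability property of this special case.
    print(py_log_prob("aab"), py_log_prob("aba"))
```

Because this special case is exchangeable, reordering a sequence leaves its score unchanged; the Contribution section above describes the paper's method as relaxing exactly this assumption.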

Code & Models

Repositories

GiulioTani/InnovationProcessesInference (official)

Taxonomy

Topics: Semantic Web and Ontologies