
TL;DR
This paper explores how coalgebras can formalize and measure the syntactic behavior of texts in stylometry, enabling quantitative comparison of texts through a probabilistic transition system framework.
Contribution
It introduces a coalgebraic approach to model text behavior and proposes a polynomial-time algorithm to approximate behavioral distances for text comparison.
Findings
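Coalgebraic models embed the syntactic features of a text into probabilistic transition systems.
Behavioral distance quantifies the differences between points in these systems, and thus between texts.
The behavioral distance can be approximated by a polynomial-time algorithm.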
Abstract
The syntactic behaviour of texts can vary considerably depending on their context (e.g. author or genre). From the standpoint of stylometry, it can be helpful to measure this behaviour objectively. In this paper, we discuss how coalgebras can be used to formalise the notion of behaviour by embedding syntactic features of a given text into probabilistic transition systems. By introducing the behavioural distance, we can then quantitatively measure differences between points in these systems and thus compare features of different texts. Furthermore, the behavioural distance between points can be approximated by a polynomial-time algorithm.
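To make the idea of a probabilistic transition system concrete, the following is a minimal sketch, not the paper's construction. It assumes each text has already been reduced to a sequence of part-of-speech tags (the tag sequences here are made up), estimates next-tag probabilities for each tag, and compares the same state across two texts. The comparison uses the one-step total variation distance as a crude stand-in: it is not the behavioural distance discussed in the abstract, which also accounts for how successor states themselves behave. The names `transition_system` and `total_variation` are hypothetical.

```python
from collections import Counter, defaultdict


def transition_system(tags):
    """Estimate P(next tag | current tag) from a single tag sequence."""
    counts = defaultdict(Counter)
    for current, nxt in zip(tags, tags[1:]):
        counts[current][nxt] += 1
    return {
        state: {nxt: n / sum(successors.values()) for nxt, n in successors.items()}
        for state, successors in counts.items()
    }


def total_variation(p, q):
    """One-step total variation distance between two successor distributions."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(x, 0.0) - q.get(x, 0.0)) for x in support)


# Made-up tag sequences standing in for two texts (assumption, not real data).
text_a = ["DET", "NOUN", "VERB", "DET", "NOUN", "VERB", "DET", "ADJ", "NOUN"]
text_b = ["DET", "ADJ", "NOUN", "VERB", "DET", "ADJ", "NOUN", "VERB", "ADV"]

system_a = transition_system(text_a)
system_b = transition_system(text_b)

# Compare how the state "DET" behaves, one step ahead, in each text.
det_distance = total_variation(system_a["DET"], system_b["DET"])
print(f"one-step distance for DET: {det_distance:.3f}")
```

In this toy input, a determiner in the first text is usually followed by a noun, while in the second it is always followed by an adjective, so the one-step distance for `DET` is large. A behavioural distance would refine this by also weighing how differently the successor states go on to behave.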
Taxonomy
Topics: Natural Language Processing Techniques · Authorship Attribution and Profiling · Rough Sets and Fuzzy Logic
