Classification Analysis Of Authorship Fiction Texts in The Space Of Semantic Fields
Bohdan Pavlyshenko

TL;DR
This paper evaluates naive Bayesian (NB) and k-nearest-neighbors (kNN) classifiers for authorship attribution of English fiction texts represented as semantic field frequency vectors; the results point to distinct, author-specific semantic patterns.
Contribution
It introduces a semantic field-based vector space approach for authorship classification and compares the effectiveness of NB and kNN classifiers in this context.
Findings
High classification accuracy indicates distinct semantic patterns per author.
Semantic fields of nouns and verbs effectively differentiate authors.
The approach highlights specific semantic spheres linked to individual author styles.
Abstract
The use of the naive Bayesian classifier (NB) and the k-nearest-neighbors classifier (kNN) for the semantic classification analysis of authors' English fiction texts is analysed. The authors' works are represented in a vector space whose basis is formed by the frequency characteristics of the semantic fields of nouns and verbs. The high precision with which authors' texts are classified in this vector space of semantic fields indicates that the space contains distinct regions of an author's idiolect, which characterize the individual author's style.
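The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the author's implementation: the word-to-field lexicon, the field labels, the toy texts, and all function names below are hypothetical, and the smoothing and distance choices (Laplace smoothing, Euclidean distance on relative frequencies) are assumptions made for illustration only.

```python
import math
from collections import Counter

# Hypothetical lexicon mapping words to semantic fields (illustrative only).
FIELDS = {
    "sea": "noun.object", "ship": "noun.artifact", "sail": "verb.motion",
    "love": "noun.feeling", "heart": "noun.body", "weep": "verb.emotion",
}
FIELD_NAMES = sorted(set(FIELDS.values()))  # fixed basis of the vector space

def field_counts(text):
    """Map a text to a vector of semantic field counts."""
    counts = Counter(FIELDS[w] for w in text.lower().split() if w in FIELDS)
    return [counts[f] for f in FIELD_NAMES]

def normalize(vec):
    """Convert counts to relative frequencies."""
    total = sum(vec)
    return [v / total if total else 0.0 for v in vec]

def knn_predict(train, query, k=1):
    """Majority vote among the k nearest training texts (Euclidean distance)."""
    q = normalize(query)
    nearest = sorted((math.dist(normalize(v), q), a) for v, a in train)[:k]
    return Counter(a for _, a in nearest).most_common(1)[0][0]

def nb_train(train, alpha=1.0):
    """Multinomial naive Bayes with Laplace smoothing (alpha is an assumption)."""
    sums, docs = {}, Counter()
    for vec, author in train:
        docs[author] += 1
        acc = sums.setdefault(author, [0] * len(vec))
        for i, v in enumerate(vec):
            acc[i] += v
    n_docs = sum(docs.values())
    model = {}
    for author, acc in sums.items():
        total = sum(acc) + alpha * len(acc)
        log_prior = math.log(docs[author] / n_docs)
        log_like = [math.log((c + alpha) / total) for c in acc]
        model[author] = (log_prior, log_like)
    return model

def nb_predict(model, query):
    """Pick the author with the highest posterior log-probability."""
    def score(author):
        log_prior, log_like = model[author]
        return log_prior + sum(c * lp for c, lp in zip(query, log_like))
    return max(model, key=score)

# Toy corpus with two hypothetical authors.
texts = [
    ("the ship did sail the sea", "A"),
    ("sail the ship on the sea", "A"),
    ("love made the heart weep", "B"),
    ("the heart did weep for love", "B"),
]
train = [(field_counts(t), a) for t, a in texts]
query = field_counts("a ship on the sea")
model = nb_train(train)
knn_author = knn_predict(train, query, k=1)
nb_author = nb_predict(model, query)
```

In this toy setup both classifiers assign the query text to author "A", because its field counts fall only in fields that author "A" uses. A realistic experiment would need a full semantic field lexicon for nouns and verbs and a held-out evaluation set.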
Taxonomy
Topics: Authorship Attribution and Profiling
