The Word2vec Graph Model for Author Attribution and Genre Detection in Literary Analysis
Nafis Irtiza Tripto, Mohammed Eunus Ali

TL;DR
This paper introduces a novel Word2vec graph model for literary analysis tasks like author attribution and genre detection, capturing context and style more effectively than traditional features.
Contribution
The paper presents a new Word2vec graph-based document modeling approach that improves classification accuracy in literary analysis tasks across different datasets.
Findings
Outperforms traditional feature-based methods in author attribution
Effective in genre detection across diverse literary datasets
Code and data are publicly available for reproducibility
Abstract
Analyzing the writing styles of authors and articles is key to supporting various literary analyses such as author attribution and genre detection. Over the years, rich sets of features, including stylometry, bag-of-words, and n-grams, have been widely used to perform such analysis. However, the effectiveness of these features largely depends on the linguistic aspects of a particular language and on dataset-specific characteristics. Consequently, techniques based on these feature sets cannot give the desired results across domains. In this paper, we propose a novel Word2vec graph-based modeling of a document that can capture both the context and the style of the document. Using these Word2vec graph-based features, we perform classification for author attribution and genre detection tasks. Our detailed experimental study with a comprehensive set of literary writings shows the…
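As a rough illustration of the idea described in the abstract, the sketch below builds a small similarity graph over a document's words and derives a few graph-level features that a classifier could consume. It is not the authors' construction: the toy embedding table `TOY_VECTORS`, the cosine threshold of 0.8, and the chosen features (density, average degree, average edge weight) are all assumptions made for illustration. A real pipeline would presumably take its vectors from a trained Word2vec model.

```python
import math
from itertools import combinations

# Hypothetical 3-dimensional embeddings standing in for trained Word2vec vectors.
TOY_VECTORS = {
    "king": [0.9, 0.1, 0.0],
    "queen": [0.8, 0.2, 0.1],
    "castle": [0.7, 0.0, 0.3],
    "river": [0.0, 0.9, 0.2],
    "stream": [0.1, 0.8, 0.3],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def build_word_graph(tokens, vectors, threshold=0.8):
    """Link distinct in-vocabulary words whose similarity meets the threshold.

    The threshold is an assumed hyperparameter, not a value from the paper.
    """
    nodes = sorted({t for t in tokens if t in vectors})
    edges = {}
    for a, b in combinations(nodes, 2):
        sim = cosine(vectors[a], vectors[b])
        if sim >= threshold:
            edges[(a, b)] = sim  # undirected edge weighted by similarity
    return nodes, edges


def graph_features(nodes, edges):
    """Summarise the graph as a flat feature dictionary for a classifier."""
    n, m = len(nodes), len(edges)
    possible = n * (n - 1) / 2
    return {
        "num_nodes": n,
        "num_edges": m,
        "density": m / possible if possible else 0.0,
        "avg_degree": 2 * m / n if n else 0.0,
        "avg_edge_weight": sum(edges.values()) / m if m else 0.0,
    }


tokens = "the king and the queen left the castle by the river".split()
nodes, edges = build_word_graph(tokens, TOY_VECTORS)
features = graph_features(nodes, edges)
print(features)
```

One feature dictionary per document could then be stacked into a matrix and passed to any standard classifier, with author or genre labels as targets. Which graph statistics carry stylistic signal is an open design choice in this sketch.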
Taxonomy
Topics: Authorship Attribution and Profiling · Topic Modeling · Natural Language Processing Techniques
Methods: Sparse Evolutionary Training
