Book Success Prediction with Pretrained Sentence Embeddings and Readability Scores
Muhammad Khalifa, Aminul Islam

TL;DR
This paper introduces a novel approach for predicting book success using pretrained sentence embeddings and readability scores, outperforming previous methods without relying on lexical or syntactic features.
Contribution
The study presents a new model that combines pretrained sentence embeddings with readability scores for book success prediction, requiring only the first 1,000 sentences.
Findings
The model outperforms strong baselines by up to 6.4% F1-score points
Only the first 1,000 sentences are needed for accurate prediction
No count-based, lexical, or syntactic features are required
Abstract
Predicting the potential success of a book in advance is vital in many applications. It could help publishers decide whether a book is worth publishing and readers decide whether it is worth reading. In this paper, we propose a model that leverages pretrained sentence embeddings along with various readability scores for book success prediction. Unlike previous methods, the proposed method requires no count-based, lexical, or syntactic features. Instead, we use a convolutional neural network over pretrained sentence embeddings and leverage different readability scores through a simple concatenation operation. Our proposed model outperforms strong baselines for this task by as much as 6.4% F1-score points. Moreover, our experiments show that, according to our model, the first 1K sentences alone are sufficient to predict the potential success of books.
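The architecture described in the abstract (a convolution over sentence embeddings, with readability scores joined by concatenation) can be illustrated with a minimal forward-pass sketch. This is not the authors' implementation: the dimensions, the filter width, the max-over-time pooling, the sigmoid output layer, the random weights, and every name below (`conv_relu_maxpool`, `predict_success`, `random_params`) are assumptions made for illustration. Random vectors stand in for real pretrained sentence embeddings, and the readability values are hypothetical normalised scores. Plain Python is used so the sketch stays self-contained.

```python
import math
import random

MAX_SENTENCES = 1000  # the abstract reports that the first 1K sentences suffice


def conv_relu_maxpool(sentences, filters, biases):
    """Slide each filter over windows of consecutive sentence embeddings,
    apply ReLU, and keep the maximum activation per filter."""
    pooled = []
    for filt, bias in zip(filters, biases):
        width = len(filt)
        best = 0.0  # ReLU outputs are never below zero
        for start in range(len(sentences) - width + 1):
            act = bias
            for offset in range(width):
                sent = sentences[start + offset]
                act += sum(w * x for w, x in zip(filt[offset], sent))
            best = max(best, act)
        pooled.append(best)
    return pooled


def predict_success(sentences, readability, params):
    """Return an (untrained) success probability for one book."""
    pooled = conv_relu_maxpool(sentences, params["filters"], params["biases"])
    features = pooled + list(readability)  # simple concatenation
    logit = params["out_bias"] + sum(
        w * f for w, f in zip(params["out_weights"], features)
    )
    return 1.0 / (1.0 + math.exp(-logit))


def random_params(rng, emb_dim, num_filters, width, num_readability):
    """Small random weights; a real model would learn these from data."""
    def vec(n):
        return [rng.gauss(0.0, 0.1) for _ in range(n)]

    return {
        "filters": [[vec(emb_dim) for _ in range(width)] for _ in range(num_filters)],
        "biases": vec(num_filters),
        "out_weights": vec(num_filters + num_readability),
        "out_bias": 0.0,
    }


# Hypothetical sizes, chosen only to keep the demo fast.
EMB_DIM, NUM_FILTERS, WIDTH = 8, 4, 3
rng = random.Random(0)

# Stand-in for pretrained sentence embeddings of one book (50 sentences).
book = [[rng.gauss(0.0, 1.0) for _ in range(EMB_DIM)] for _ in range(50)]
book = book[:MAX_SENTENCES]  # keep only the first 1K sentences

readability = [0.62, 0.48, 0.71]  # hypothetical normalised readability scores
params = random_params(rng, EMB_DIM, NUM_FILTERS, WIDTH, len(readability))

prob = predict_success(book, readability, params)
print(f"predicted success probability: {prob:.3f}")
```

Because the weights are random, the printed probability is meaningless; the sketch only shows how the pooled convolutional features and the readability scores end up in a single feature vector before classification.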
Taxonomy
Topics: Text Readability and Simplification · Natural Language Processing Techniques · Advanced Text Analysis Techniques
