# Enhancing book genre classification with BERT and InceptionV3: a deep learning approach for libraries

**Authors:** Xinting Yang, Zehua Zhang

PMC · DOI: 10.7717/peerj-cs.2934 · PeerJ Computer Science · 2025-06-05

## TL;DR

This paper introduces a hybrid deep learning model that combines BERT (for book titles) and InceptionV3 (for cover images) to improve book genre classification.

## Contribution

A novel hybrid deep learning model that integrates visual and textual features for book genre classification.

## Key findings

- The proposed model achieves a balanced accuracy of 0.7951 on the BookCover30 dataset.
- With an F1-score of 0.7920, it outperforms both the standalone image-based and the standalone text-based classifiers.
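
For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how they are commonly computed. Balanced accuracy is the mean of the per-class recalls. The summary does not say how the F1-score is averaged across genres, so the macro average used here is an assumption, and the genre labels and predictions are hypothetical toy data, not results from the paper.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over every class that occurs in y_true."""
    total = defaultdict(int)
    correct = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (assumed averaging scheme)."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    scores = []
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # A class with no true or predicted samples contributes 0.0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy labels, for illustration only.
y_true = ["cookbook", "cookbook", "cookbook", "travel", "travel", "romance"]
y_pred = ["cookbook", "cookbook", "travel", "travel", "travel", "cookbook"]

print(round(balanced_accuracy(y_true, y_pred), 4))  # 0.5556
print(round(macro_f1(y_true, y_pred), 4))           # 0.4889
```

Because every class counts equally in both metrics, a rare genre that is always misclassified (here `romance`) pulls the score down as much as a common one would.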

## Abstract

Accurate book genre classification is essential for library organization, information retrieval, and personalized recommendations. Traditional classification methods, often reliant on manual categorization and metadata-based approaches, struggle with the complexities of hybrid genres and evolving literary trends. To address these limitations, this study proposes a hybrid deep learning model that integrates visual and textual features for enhanced genre classification. Specifically, we employ InceptionV3, an advanced convolutional neural network architecture, to extract visual features from book cover images and bidirectional encoder representations from transformers (BERT) to analyze textual data from book titles. A scaled dot-product attention mechanism is used to effectively fuse these multimodal features, dynamically weighting their contributions based on contextual relevance. Experimental results on the BookCover30 dataset demonstrate that our proposed model outperforms baseline approaches, achieving a balanced accuracy of 0.7951 and an F1-score of 0.7920, surpassing both standalone image- and text-based classifiers. This study highlights the potential of deep learning in improving automated genre classification, offering a scalable and adaptable solution for libraries and digital platforms. Future research may focus on expanding dataset diversity, optimizing computational efficiency, and addressing biases in classification models.
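
The fusion step named in the abstract, scaled dot-product attention, can be sketched as follows. The standard formulation computes `softmax(Q K^T / sqrt(d_k)) V`. How the authors build queries, keys, and values from the InceptionV3 and BERT outputs is not stated in this summary, so treating the two modality vectors as a two-token sequence that attends to itself is an assumption made for illustration, and the 4-dimensional feature vectors are hypothetical stand-ins for projected embeddings.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """Return (outputs, weights) for softmax(Q K^T / sqrt(d_k)) V.

    queries, keys, values are lists of equal-length float vectors.
    """
    d_k = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        w = softmax(scores)
        weights.append(w)
        # Weighted sum of the value vectors.
        outputs.append([
            sum(wj * v[c] for wj, v in zip(w, values))
            for c in range(len(values[0]))
        ])
    return outputs, weights


# Hypothetical projected features (assumption: both modalities are mapped
# to the same dimension before fusion).
image_feat = [1.0, 0.0, 0.0, 0.0]  # stand-in for an InceptionV3 embedding
text_feat = [0.0, 1.0, 0.0, 0.0]   # stand-in for a BERT title embedding
tokens = [image_feat, text_feat]

fused, attn = scaled_dot_product_attention(tokens, tokens, tokens)
for row in attn:
    print([round(w, 4) for w in row])
```

Each row of `attn` sums to 1 and gives the relative weight placed on the image and text vectors for one query, which is the sense in which attention can weight each modality's contribution dynamically.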

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12193415/full.md

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12193415/full.md

## References

31 references — full list in the complete paper: https://tomesphere.com/paper/PMC12193415/full.md

---
Source: https://tomesphere.com/paper/PMC12193415