# Laser-printed document classification using random forest and gray prediction models

**Authors:** Yinxuan Qu, Chenyang Yu, Chunhui Li, Yuzhu Yang

PMC · DOI: 10.1016/j.isci.2025.114131 · iScience · 2025-11-21

## TL;DR

A new method combines random forest and gray prediction models to accurately classify laser-printed documents for forensic use.

## Contribution

Proposes a hybrid model integrating random forest and gray prediction for printer source classification.

## Key findings

- The hybrid model achieves 96.00% accuracy for Chinese characters with fewer strokes and 92.86% for punctuation marks (periods).
- The method offers non-destructive, efficient, and stable document analysis for forensic applications.

## Abstract

This paper presents a classification method for laser-printed documents that integrates the random forest algorithm with the gray prediction model to improve the accuracy and reliability of forensic document examination. The study uses 14 laser printers from five brands as experimental subjects and extracts 14 key feature parameters, such as gray mean, contrast, and distribution symmetry, with the ImageXpert analysis system. The random forest algorithm performs the classification, and the gray prediction model is used to improve classification accuracy. Experimental results show that the proposed method achieves high accuracy: 96.00% for Chinese characters with fewer strokes and 92.86% for punctuation marks (periods). Compared to traditional classification methods, the approach is more stable and more accurate. The findings highlight the advantages of non-destructive analysis, efficient classification, and robustness, underscoring the method's potential as a tool for forensic document examination in legal contexts.
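To make the classification step concrete, here is a minimal pure-Python sketch of a random forest (bootstrap sampling, a random feature subset per split, majority vote). It is not the authors' implementation: the three features, the labels `printer_A` and `printer_B`, the synthetic data, and every function name are assumptions made for illustration, whereas the study uses 14 features measured with ImageXpert.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a non-empty list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(X, y, feature_ids):
    """Return (impurity, feature, threshold) of the best split, or None."""
    best = None
    for f in feature_ids:
        values = sorted({row[f] for row in X})
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y[i] for i, row in enumerate(X) if row[f] <= t]
            right = [y[i] for i, row in enumerate(X) if row[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if best is None or score < best[0]:
                best = (score, f, t)
    return best


def build_tree(X, y, rng, max_depth, n_try):
    """Grow one tree; a leaf is a label, an inner node is a 4-tuple."""
    majority = Counter(y).most_common(1)[0][0]
    if max_depth == 0 or len(set(y)) == 1:
        return majority
    # Random feature subset at each split: the "random" in random forest.
    split = best_split(X, y, rng.sample(range(len(X[0])), n_try))
    if split is None:
        return majority
    _, f, t = split
    li = [i for i, row in enumerate(X) if row[f] <= t]
    ri = [i for i, row in enumerate(X) if row[f] > t]
    left = build_tree([X[i] for i in li], [y[i] for i in li], rng, max_depth - 1, n_try)
    right = build_tree([X[i] for i in ri], [y[i] for i in ri], rng, max_depth - 1, n_try)
    return (f, t, left, right)


def predict_tree(node, row):
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node


def fit_forest(X, y, n_trees=25, max_depth=4, seed=0):
    rng = random.Random(seed)
    n_try = max(1, int(len(X[0]) ** 0.5))
    forest = []
    for _ in range(n_trees):
        # Bootstrap sample: draw len(X) rows with replacement.
        idx = [rng.randrange(len(X)) for _ in X]
        forest.append(build_tree([X[i] for i in idx], [y[i] for i in idx],
                                 rng, max_depth, n_try))
    return forest


def predict_forest(forest, row):
    """Majority vote over all trees."""
    return Counter(predict_tree(tree, row) for tree in forest).most_common(1)[0][0]


def make_toy_data(seed=1):
    """Synthetic stand-in for [gray_mean, contrast, symmetry]; not real measurements."""
    rng = random.Random(seed)
    X, y = [], []
    for label, center in (("printer_A", 0.2), ("printer_B", 0.8)):
        for _ in range(10):
            X.append([center + rng.uniform(-0.05, 0.05) for _ in range(3)])
            y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    forest = fit_forest(X, y)
    print(predict_forest(forest, [0.22, 0.18, 0.21]))
```

In practice a library implementation would replace this hand-rolled version. The sketch only shows the mechanics that the abstract names.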

- Integrates random forest and gray prediction for printer source classification
- Hybrid model achieves 96.00% accuracy for Chinese characters and 92.86% for punctuation
- Offers a non-destructive, efficient, and stable document analysis methodology
- Enhances forensic traceability of printed documents with a data-driven methodology

Applied sciences; Network

## Full-text entities

- **Terms:** stroke, multi-stroke (strokes of Chinese characters)
- **Models:** GM(1,1) (gray prediction model)
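
GM(1,1), listed above, is the usual name for the first-order, single-variable gray prediction model. The following is a minimal sketch of the textbook GM(1,1) procedure (accumulate the series, fit the two parameters by least squares, forecast by differencing the fitted cumulative curve). The function names `gm11_fit` and `gm11_predict` and the sample series are illustrative assumptions, and this summary does not say how the authors couple the model with the random forest.

```python
import math


def gm11_fit(series):
    """Estimate the GM(1,1) parameters (a, b) from a short positive series.

    Model: x0[k] + a * z1[k] = b, where z1 is the mean of consecutive
    values of the cumulative sum x1.
    """
    x0 = [float(v) for v in series]
    if len(x0) < 3 or min(x0) <= 0:
        raise ValueError("need at least 3 strictly positive values")

    # 1-AGO: accumulated generating operation (running sum).
    x1, total = [], 0.0
    for v in x0:
        total += v
        x1.append(total)

    # Background values: means of neighbouring cumulative sums.
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, len(x1))]
    y = x0[1:]

    # Ordinary least squares for y = slope * z + b, with a = -slope.
    m = len(z)
    sz, sy = sum(z), sum(y)
    szz = sum(v * v for v in z)
    szy = sum(p * q for p, q in zip(z, y))
    slope = (m * szy - sz * sy) / (m * szz - sz * sz)
    a = -slope
    b = (sy - slope * sz) / m
    if abs(a) < 1e-12:
        raise ValueError("development coefficient is ~0; series is nearly constant")
    return a, b


def gm11_predict(series, steps=1):
    """Forecast the next `steps` values after the end of `series`."""
    a, b = gm11_fit(series)
    first = float(series[0])

    def x1_hat(k):
        # Time response of the whitened equation dx1/dt + a * x1 = b.
        return (first - b / a) * math.exp(-a * k) + b / a

    n = len(series)
    # Inverse AGO: difference the fitted cumulative curve.
    return [x1_hat(k) - x1_hat(k - 1) for k in range(n, n + steps)]


if __name__ == "__main__":
    history = [10 * 1.1 ** k for k in range(5)]  # illustrative data only
    print(gm11_predict(history, steps=2))
```

GM(1,1) is normally applied to short, positive, roughly exponential sequences, which is why the sketch rejects non-positive inputs and near-constant series.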

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12757639/full.md

## Figures

10 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12757639/full.md

## References

28 references — full list in the complete paper: https://tomesphere.com/paper/PMC12757639/full.md

---
Source: https://tomesphere.com/paper/PMC12757639