# SVTR-MG: an optical character recognition network for food packaging spray codes

**Authors:** Sanbo Pan, Peng Wang

PMC · DOI: 10.1038/s41598-025-31995-y · Scientific Reports · 2025-12-12

## TL;DR

This paper introduces SVTR-MG, a lightweight OCR network for reading spray codes on food packaging in challenging industrial settings. It reports 93.2% recognition accuracy at an inference speed of 142 FPS.

## Contribution

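SVTR-MG integrates a Multi-scale Dilated Feature Aggregation (MDFA) module and a Global Context Self-Attention (GCSA) module to improve character recognition against complex backgrounds, uneven illumination, and character deformation.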
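The abstract describes MDFA as using convolutions with varying dilation rates to enlarge the receptive field. The sketch below illustrates only that general idea in one dimension, in plain Python. The dilation rates, the kernel, and the averaging fusion in `aggregate_branches` are assumptions chosen for illustration; this summary does not specify how the paper's module combines its branches.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Span covered by one dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def dilated_conv1d(signal, kernel, dilation):
    """Zero-padded, same-length 1D dilated cross-correlation (odd kernel assumed)."""
    half = (len(kernel) // 2) * dilation
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, weight in enumerate(kernel):
            idx = i + j * dilation - half
            if 0 <= idx < len(signal):  # positions outside the signal count as zero
                acc += weight * signal[idx]
        out.append(acc)
    return out


def aggregate_branches(signal, kernel, dilations):
    """Average parallel branches with different dilation rates (assumed fusion rule)."""
    branches = [dilated_conv1d(signal, kernel, d) for d in dilations]
    return [sum(values) / len(values) for values in zip(*branches)]


if __name__ == "__main__":
    for d in (1, 2, 3):  # hypothetical dilation rates
        print(d, effective_kernel_size(3, d))
    impulse = [0, 0, 0, 1, 0, 0, 0]
    print(aggregate_branches(impulse, [1, 1, 1], (1, 2)))
```

A 3-tap kernel with dilation 3 spans 7 input positions while keeping only 3 weights, which is why dilated branches can widen context without adding parameters.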

## Key findings

- SVTR-MG achieves 93.2% recognition accuracy in complex industrial scenarios.
- The model runs at an inference speed of 142 FPS and, according to the abstract, outperforms mainstream OCR methods by approximately 5%.
- The dynamic dictionary mapping mechanism improves output alignment during decoding.
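This summary does not describe how the dynamic dictionary mapping works, so the sketch below shows only the conventional step it would sit next to: mapping per-frame class indices to characters through a dictionary. It assumes a CTC-style output with index 0 reserved for the blank symbol. The names `build_dictionary` and `greedy_ctc_decode` and the character set are hypothetical, and the dictionary here is static rather than dynamic.

```python
def build_dictionary(charset):
    """Map class indices 1..N to characters; index 0 is reserved for the CTC blank."""
    return {i + 1: ch for i, ch in enumerate(charset)}


def greedy_ctc_decode(frame_indices, dictionary, blank=0):
    """Collapse consecutive repeats, drop blanks, then look up each index."""
    chars = []
    previous = blank
    for idx in frame_indices:
        if idx != blank and idx != previous:
            chars.append(dictionary[idx])
        previous = idx
    return "".join(chars)


if __name__ == "__main__":
    # Hypothetical spray-code alphabet: digits plus a date separator.
    dictionary = build_dictionary("0123456789/")
    frames = [3, 3, 0, 1, 1, 0, 3, 0, 6, 6]
    print(greedy_ctc_decode(frames, dictionary))
```

The blank between the two `3` runs is what lets the decoder emit the digit `2` twice instead of merging the repeats into one character.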

## Abstract

Spray codes on product packaging play a critical role in food traceability, quality control, and anti-counterfeiting verification. However, accurate recognition of spray codes in industrial environments remains a significant challenge due to factors such as small character regions, fluctuating print quality, reflective packaging materials, and character deformation. To address these issues, this paper proposes a lightweight improved network named SVTR-MG. The model incorporates a Multi-scale Dilated Feature Aggregation (MDFA) module, which leverages convolutions with varying dilation rates to expand the receptive field and effectively integrate global and local features, thereby enhancing the perception of characters under multi-scale and complex background conditions. Additionally, a Global Context Self-Attention (GCSA) module is introduced, which combines channel and spatial attention mechanisms to model long-range dependencies between characters, improving the network’s robustness to uneven illumination and structural distortions. Furthermore, a dynamic dictionary mapping mechanism is proposed to optimize output alignment during the decoding phase. Experimental results demonstrate that SVTR-MG achieves a recognition accuracy of 93.2% at an inference speed of 142 FPS in complex industrial scenarios, outperforming mainstream OCR methods by approximately 5%, and meeting the real-time and accuracy requirements for deployment in production environments.
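The abstract states that GCSA combines channel and spatial attention. As a rough illustration of that combination only, the sketch below gates a small feature map first per channel and then per spatial position. It is a parameter-free simplification, assumed for illustration: the gates come from plain averages passed through a sigmoid, there are no learned weights, and the self-attention that models long-range dependencies between characters is not shown.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def channel_gate(fmap):
    """Scale each channel by the sigmoid of its global average (fmap is C x H x W)."""
    gated = []
    for channel in fmap:
        values = [v for row in channel for v in row]
        weight = sigmoid(sum(values) / len(values))
        gated.append([[v * weight for v in row] for row in channel])
    return gated


def spatial_gate(fmap):
    """Scale each position by the sigmoid of its mean across channels."""
    n_channels = len(fmap)
    height, width = len(fmap[0]), len(fmap[0][0])
    mask = [
        [sigmoid(sum(fmap[c][y][x] for c in range(n_channels)) / n_channels)
         for x in range(width)]
        for y in range(height)
    ]
    return [
        [[fmap[c][y][x] * mask[y][x] for x in range(width)] for y in range(height)]
        for c in range(n_channels)
    ]


def channel_spatial_attention(fmap):
    """Apply the channel gate, then the spatial gate (ordering is an assumption)."""
    return spatial_gate(channel_gate(fmap))


if __name__ == "__main__":
    feature_map = [
        [[0.0, 0.0], [0.0, 0.0]],  # a weak channel
        [[2.0, 2.0], [2.0, 2.0]],  # a strong channel
    ]
    print(channel_spatial_attention(feature_map))
```

Because both gates lie strictly between 0 and 1, the output keeps the input shape and only rescales activations; a trained module would learn which channels and positions to emphasize.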


## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12816632/full.md

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12816632/full.md

## References

26 references — full list in the complete paper: https://tomesphere.com/paper/PMC12816632/full.md

---
Source: https://tomesphere.com/paper/PMC12816632