Raw Produce Quality Detection with Shifted Window Self-Attention
Oh Joon Kwon, Byungsoo Kim, Youngduck Choi

TL;DR
This paper evaluates the Swin Transformer for raw produce quality detection (RPQD) and reports that it matches or outperforms CNNs in accuracy while being more data- and compute-efficient across several food datasets, which the authors present as a step toward practical deployment.
Contribution
First large-scale empirical comparison of Swin Transformer and CNNs for raw produce quality detection across multiple food types.
Findings
Swin Transformer achieves accuracy better than or comparable to that of CNNs.
Swin Transformer is more data- and compute-efficient.
This is the first large-scale study on RPQD using Transformer models.
Abstract
Global food insecurity is expected to worsen in the coming decades with the accelerating rate of climate change and the rapidly increasing population. It is therefore important to remove inefficiencies at every level of food production. Recent advances in deep learning can help reduce such inefficiencies, but their application has not yet become mainstream throughout the industry, which incurs economic costs at a massive scale. To date, modern techniques such as CNNs (Convolutional Neural Networks) have been applied to RPQD (Raw Produce Quality Detection) tasks. Meanwhile, the Transformer's successful debut in vision, among other modalities, led us to expect better performance from Transformer-based models in RPQD. In this work, we exclusively investigate the recent state-of-the-art Swin (Shifted Windows) Transformer, which computes self-attention in both intra-…
Taxonomy
Topics: Advanced Chemical Sensor Technologies · Spectroscopy and Chemometric Analyses · Smart Agriculture and AI
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Stochastic Depth · Label Smoothing · Absolute Position Encodings · Residual Connection · Softmax · Swin Transformer · Adam
