Integrating Human Vision Perception in Vision Transformers for Classifying Waste Items
Akshat Kishore Shrivastava, Tapan Kumar Gandhi

TL;DR
This paper introduces a waste classification method inspired by human vision: it simulates nystagmus through differential blurring of the dataset and improves on the standard Vision Transformer's accuracy by 2%.
Contribution
It presents a new approach that integrates human visual perception phenomena into Vision Transformers to enhance waste classification accuracy.
Findings
Outperforms standard Vision Transformer by 2% in waste classification.
Simulates nystagmus through differential blurring to mimic human vision.
Potential for broader application in global issues.
Abstract
In this paper, we propose a novel methodology aimed at simulating the learning phenomenon of nystagmus through the application of differential blurring on datasets. Nystagmus is a biological phenomenon that influences human vision throughout life, notably by diminishing head shake from infancy to adulthood. Leveraging this concept, we address the issue of waste classification, a pressing global concern. The proposed framework comprises two modules, with the second module closely resembling the original Vision Transformer, a state-of-the-art model in classification tasks. The primary motivation behind our approach is to enhance the model's precision and adaptability, mirroring the real-world conditions that the human visual system undergoes. This novel methodology surpasses the standard Vision Transformer model in waste classification tasks, exhibiting an improvement with a margin…
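The abstract does not spell out how differential blurring is applied, so the following is only a minimal sketch of one plausible reading: images are blurred strongly at first and progressively less in later training stages, loosely mirroring head shake diminishing from infancy to adulthood. The function names (`gaussian_kernel`, `blur_image`, `blur_schedule`), the Gaussian blur itself, the linear schedule, and every parameter value are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def gaussian_kernel(sigma):
    """Return a normalized 1-D Gaussian kernel (radius of 3*sigma is an assumption)."""
    radius = max(1, int(np.ceil(3 * sigma)))
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    kernel = np.exp(-(x ** 2) / (2 * sigma ** 2))
    return kernel / kernel.sum()


def blur_image(img, sigma):
    """Separable Gaussian blur of an H x W or H x W x C image; sigma <= 0 means no blur."""
    out = np.asarray(img, dtype=np.float64).copy()
    if sigma <= 0:
        return out
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    for axis in (0, 1):  # blur rows, then columns
        pad = [(0, 0)] * out.ndim
        pad[axis] = (radius, radius)
        padded = np.pad(out, pad, mode="edge")  # replicate borders to keep the size
        out = np.apply_along_axis(
            lambda v: np.convolve(v, kernel, mode="valid"), axis, padded
        )
    return out


def blur_schedule(num_stages, max_sigma):
    """Hypothetical schedule: sigma decreases linearly from max_sigma to 0."""
    if num_stages < 2:
        return [0.0]
    return [max_sigma * (1 - i / (num_stages - 1)) for i in range(num_stages)]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.random((32, 32, 3))  # stand-in for a waste-item photo
    for stage, sigma in enumerate(blur_schedule(num_stages=4, max_sigma=3.0)):
        blurred = blur_image(image, sigma)
        print(f"stage {stage}: sigma={sigma:.2f}, pixel variance={blurred.var():.4f}")
```

In a training pipeline, each stage's blurred copy of the dataset would be fed to the classifier in turn; whether the paper uses stages, a per-sample blur level, or some other scheme cannot be determined from the abstract alone.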
Taxonomy
Topics: Ocular and Laser Science Research · EEG and Brain-Computer Interfaces · Visual perception and processing mechanisms
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Dense Connections · Position-Wise Feed-Forward Layer · Absolute Position Encodings · Label Smoothing · Adam · Layer Normalization · Vision Transformer
