Integrating Human Vision Perception in Vision Transformers for Classifying Waste Items

Akshat Kishore Shrivastava; Tapan Kumar Gandhi

arXiv:2312.12143 · cs.CV · December 21, 2023 · 1 cites

TL;DR

This paper introduces a waste classification method inspired by human vision: it simulates nystagmus through differential blurring and improves on the standard Vision Transformer's accuracy by 2%.

Contribution

It presents a new approach that integrates human visual perception phenomena into Vision Transformers to enhance waste classification accuracy.

Findings

01

Outperforms standard Vision Transformer by 2% in waste classification.

02

Simulates nystagmus through differential blurring to mimic human vision.

03

Potential for broader application in global issues.

Abstract

In this paper, we propose a novel methodology aimed at simulating the learning phenomenon of nystagmus through the application of differential blurring on datasets. Nystagmus is a biological phenomenon that influences human vision throughout life, notably by diminishing head shake from infancy to adulthood. Leveraging this concept, we address the issue of waste classification, a pressing global concern. The proposed framework comprises two modules, with the second module closely resembling the original Vision Transformer, a state-of-the-art model in classification tasks. The primary motivation behind our approach is to enhance the model's precision and adaptability, mirroring the real-world conditions that the human visual system undergoes. This novel methodology surpasses the standard Vision Transformer model in waste classification tasks, exhibiting an improvement with a margin…
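
The abstract does not spell out how differential blurring is applied, so the following Python sketch is only one plausible reading: training images are blurred heavily in early stages and progressively less in later ones, loosely echoing the abstract's description of nystagmus diminishing from infancy to adulthood. The linear decay, the `max_sigma` value, and all function names (`blur_sigma_schedule`, `gaussian_kernel`, `gaussian_blur`) are assumptions made for illustration, not the authors' implementation. The sketch uses only the standard library and treats an image as a list of rows of grayscale floats; a real pipeline would more likely rely on an image library.

```python
import math


def blur_sigma_schedule(num_stages, max_sigma=4.0):
    """Hypothetical schedule: sigma decays linearly from max_sigma to 0.

    Assumption: early training stages see blurrier images than later ones.
    """
    if num_stages < 2:
        return [0.0] * num_stages
    return [max_sigma * (1 - i / (num_stages - 1)) for i in range(num_stages)]


def gaussian_kernel(sigma):
    """Return a normalized 1D Gaussian kernel; sigma <= 0 means no blur."""
    if sigma <= 0:
        return [1.0]
    radius = max(1, math.ceil(3 * sigma))
    weights = [
        math.exp(-(x * x) / (2 * sigma * sigma))
        for x in range(-radius, radius + 1)
    ]
    total = sum(weights)
    return [w / total for w in weights]


def _blur_rows(image, kernel):
    """Convolve every row with the kernel, clamping indices at the edges."""
    radius = len(kernel) // 2
    width = len(image[0])
    blurred = []
    for row in image:
        new_row = []
        for x in range(width):
            acc = 0.0
            for k, weight in enumerate(kernel):
                xi = min(max(x + k - radius, 0), width - 1)
                acc += weight * row[xi]
            new_row.append(acc)
        blurred.append(new_row)
    return blurred


def _transpose(image):
    return [list(column) for column in zip(*image)]


def gaussian_blur(image, sigma):
    """Separable Gaussian blur: filter the rows, then the columns."""
    kernel = gaussian_kernel(sigma)
    horizontal = _blur_rows(image, kernel)
    return _transpose(_blur_rows(_transpose(horizontal), kernel))


# Usage: a 5x5 image with a single bright pixel, blurred once per stage.
image = [[0.0] * 5 for _ in range(5)]
image[2][2] = 1.0
stages = [gaussian_blur(image, sigma) for sigma in blur_sigma_schedule(3)]
```

Under this assumed schedule the last stage uses a sigma of zero, so the model would finish training on unblurred images. Whether the paper applies blur per stage, per sample, or in its first module is not stated in the abstract.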

Taxonomy

Topics: Ocular and Laser Science Research · EEG and Brain-Computer Interfaces · Visual perception and processing mechanisms

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Dense Connections · Position-Wise Feed-Forward Layer · Absolute Position Encodings · Label Smoothing · Adam · Layer Normalization · Vision Transformer