Deep neural networks can be improved using human-derived contextual expectations
Harish Katti, Marius V. Peelen, S. P. Arun

TL;DR
This paper shows that integrating human-derived scene expectations into deep neural networks improves object detection accuracy by 1-20%, because these expectations supply contextual cues learned independently of target features.
Contribution
The study introduces a novel method of using human-derived scene expectations to improve deep neural network performance in object detection tasks.
Findings
Augmenting neural networks with predicted human expectations improves detection accuracy by 1-20%.
Humans show systematic scene expectations that can be predicted from scene features.
Contextual expectations help reduce false alarms and improve target localization.
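The findings above do not say how the predicted expectations are combined with the network output. As a minimal sketch, assuming a simple weighted average of detector confidence and a scene-level context likelihood (the `Detection` class, `rescore`, `keep_confident`, and the `weight` parameter are hypothetical names chosen for illustration, not the authors' method), the false-alarm reduction could look like this:

```python
from dataclasses import dataclass


@dataclass
class Detection:
    box: tuple    # (x, y, width, height) in pixels
    score: float  # detector confidence in [0, 1]


def rescore(detections, context_likelihood, weight=0.5):
    """Blend each detector score with a scene-level context likelihood.

    Assumption: a linear blend. The source does not specify the fusion rule.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be in [0, 1]")
    return [
        Detection(d.box, (1.0 - weight) * d.score + weight * context_likelihood)
        for d in detections
    ]


def keep_confident(detections, threshold=0.5):
    """Keep only detections whose score reaches the threshold."""
    return [d for d in detections if d.score >= threshold]


# A moderately confident detection in a scene where the target is unlikely
# (for example, a "car" on a lake) is pushed below the threshold.
candidates = [Detection((10, 20, 50, 30), 0.6)]
unlikely_scene = keep_confident(rescore(candidates, context_likelihood=0.1))
likely_scene = keep_confident(rescore(candidates, context_likelihood=0.9))
```

With these illustrative numbers, the same detection is rejected in the unlikely scene and kept in the likely one, which is the intuition behind context reducing false alarms.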
Abstract
Real-world objects occur in specific contexts. Such context has been shown to facilitate detection by constraining the locations to search. But can context directly benefit object detection? To do so, context needs to be learned independently from target features. This is impossible in traditional object detection where classifiers are trained on images containing both target features and surrounding context. In contrast, humans can learn context and target features separately, such as when we see highways without cars. Here we show for the first time that human-derived scene expectations can be used to improve object detection performance in machines. To measure contextual expectations, we asked human subjects to indicate the scale, location and likelihood at which cars or people might occur in scenes without these objects. Humans showed highly systematic expectations that we could…
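To make the measurement step concrete, the sketch below shows one way per-subject responses for a single scene could be stored and averaged. The encoding is an assumption: the abstract states only that subjects indicated scale, location, and likelihood, so the normalized coordinates, the 0-1 likelihood range, and the names `Expectation` and `aggregate` are hypothetical.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Expectation:
    x: float           # expected horizontal centre, fraction of image width (assumed)
    y: float           # expected vertical centre, fraction of image height (assumed)
    scale: float       # expected object size, fraction of image height (assumed)
    likelihood: float  # rated chance that the object occurs, in [0, 1] (assumed)


def aggregate(responses):
    """Average the responses of several subjects for one scene."""
    if not responses:
        raise ValueError("need at least one response")
    return Expectation(
        x=mean(r.x for r in responses),
        y=mean(r.y for r in responses),
        scale=mean(r.scale for r in responses),
        likelihood=mean(r.likelihood for r in responses),
    )


# Two hypothetical subjects judging where a car might appear on an empty road.
scene_responses = [
    Expectation(x=0.25, y=0.5, scale=0.25, likelihood=0.75),
    Expectation(x=0.75, y=0.5, scale=0.75, likelihood=0.25),
]
consensus = aggregate(scene_responses)
```

A per-scene average like `consensus` is the kind of target that a model trained on scene features could then be asked to predict.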
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Video Surveillance and Tracking Methods
