What am I Searching for: Zero-shot Target Identity Inference in Visual Search

Mengmi Zhang; Gabriel Kreiman

arXiv:1807.11926 · cs.CV · June 3, 2020

TL;DR

This paper introduces InferNet, a zero-shot model that infers a subject's search target from their error fixations (fixations that land on non-target objects) during visual search. It outperforms competitive null models without object-specific training, showing that search intent can be decoded from eye movement behavior.

Contribution

The paper presents InferNet, a zero-shot neural network model that infers search targets from eye movements without prior object-specific training and outperforms competitive null models.

Findings

01 InferNet outperforms baseline models in target inference accuracy.

02 Error fixations contain sufficient information to decode search intent.

03 The model generalizes across different search scenarios without additional training.

Abstract

Can we infer intentions from a person's actions? As an example problem, here we consider how to decipher what a person is searching for by decoding their eye movement behavior. We conducted two psychophysics experiments where we monitored eye movements while subjects searched for a target object. We defined the fixations falling on non-target objects as "error fixations". Using those error fixations, we developed a model (InferNet) to infer what the target was. InferNet uses a pre-trained convolutional neural network to extract features from the error fixations and computes a similarity map between the error fixations and all locations across the search image. The model consolidates the similarity maps across layers and integrates these maps across all error fixations. InferNet successfully identifies the subject's goal and outperforms competitive null models, even without any…
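The pipeline described in the abstract (per-fixation similarity maps, consolidated across layers, then integrated across error fixations) can be sketched in a few lines. This is a toy illustration under stated assumptions, not the authors' implementation: the feature grids are hand-written stand-ins for pre-trained CNN activations, all layers are assumed to share one grid resolution, similarity is plain cosine similarity at single grid cells, layers are consolidated by averaging, and fixations are integrated by summing. The function names (`similarity_map`, `infer_target_map`, `best_unvisited`) are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0

def similarity_map(layer, fixation):
    """Similarity between the fixated cell and every cell of one layer."""
    r, c = fixation
    probe = layer[r][c]
    return [[cosine(probe, vec) for vec in row] for row in layer]

def infer_target_map(layers, fixations):
    """Average maps across layers (assumption), then sum across fixations (assumption)."""
    rows, cols = len(layers[0]), len(layers[0][0])
    total = [[0.0] * cols for _ in range(rows)]
    for fix in fixations:
        maps = [similarity_map(layer, fix) for layer in layers]
        for r in range(rows):
            for c in range(cols):
                total[r][c] += sum(m[r][c] for m in maps) / len(maps)
    return total

def best_unvisited(total, fixations):
    """Highest-scoring cell that was not itself an error fixation."""
    visited = set(fixations)
    candidates = [(total[r][c], (r, c))
                  for r in range(len(total))
                  for c in range(len(total[0]))
                  if (r, c) not in visited]
    return max(candidates)[1]

# Toy 2x3 "search image": each cell holds a 2-D feature vector per layer.
# These values are invented for illustration; they are not CNN outputs.
layer_a = [
    [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]],
    [[0.0, 1.0], [0.1, 0.9], [1.0, 0.1]],
]
layer_b = [
    [[1.0, 0.2], [1.0, 0.0], [0.0, 1.0]],
    [[0.2, 1.0], [0.0, 1.0], [0.9, 0.0]],
]
error_fixations = [(0, 0), (0, 1)]  # cells the subject looked at by mistake

score_map = infer_target_map([layer_a, layer_b], error_fixations)
inferred = best_unvisited(score_map, error_fixations)
print(inferred)
```

In this toy grid, the cell at row 1, column 2 has features close to both fixated cells, so it receives the highest integrated score among unvisited cells. The intuition being illustrated is that error fixations tend to land on objects resembling the target, so locations that look like the fixated ones are good target candidates.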

Code & Models

Repositories

kreimanlab/HumanIntentionInferenceZeroShot (PyTorch, official)

Taxonomy

Topics: Visual Attention and Saliency Detection · Domain Adaptation and Few-Shot Learning · Visual perception and processing mechanisms

Methods: Max Pooling · Convolution