Context Sensitivity Improves Human-Machine Visual Alignment
Frieda Born, Tom Neuhäuser, Lukas Muttenthaler, Brett D. Roads, Bernhard Spitzer, Andrew K. Lampinen, Matt Jones, Klaus-Robert Müller, Michael C. Mozer

TL;DR
This paper introduces a context-sensitive similarity method for neural network embeddings. On a triplet odd-one-out task, where an anchor image serves as the context, it improves accuracy by up to 15% over a context-insensitive model.
Contribution
It presents a method for computing context-sensitive similarity from neural network embeddings, bringing model similarity judgments closer to human visual perception.
Findings
Up to a 15% improvement in odd-one-out accuracy over a context-insensitive model.
The improvement is consistent across both original and "human-aligned" vision foundation models.
Modeling context enhances human-like visual understanding.
Abstract
Modern machine learning models typically represent inputs as fixed points in a high-dimensional embedding space. While this approach has proven powerful for a wide range of downstream tasks, it fundamentally differs from the way humans process information. Because humans are constantly adapting to their environment, they represent objects and their relationships in a highly context-sensitive manner. To address this gap, we propose a method for context-sensitive similarity computation from neural network embeddings, applied to modeling a triplet odd-one-out task with an anchor image serving as simultaneous context. Modeling context enables us to achieve up to a 15% improvement in odd-one-out accuracy over a context-insensitive model. We find that this improvement is consistent across both original and "human-aligned" vision foundation models.
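To make the contrast in the abstract concrete, the sketch below shows a context-insensitive odd-one-out decision on fixed embeddings and one way an anchor embedding might modulate it. The abstract does not specify the authors' formulation, so everything here is an assumption for illustration: the function names, the use of cosine similarity, the treatment of the anchor as a separate embedding, and the softmax weighting in `context_weights` are all hypothetical and should not be read as the paper's method.

```python
import numpy as np


def cosine_similarity(a, b):
    """Cosine similarity between two 1-D vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def odd_one_out(embeddings, weights=None):
    """Return the index (0, 1, or 2) of the odd-one-out among three embeddings.

    The most similar pair is found; the remaining item is the odd one out.
    If `weights` is given, each embedding dimension is rescaled before
    similarities are computed (hypothetical context modulation).
    """
    x = np.asarray(embeddings, dtype=float)
    if weights is not None:
        x = x * np.asarray(weights, dtype=float)
    pairs = [(0, 1), (0, 2), (1, 2)]
    sims = [cosine_similarity(x[i], x[j]) for i, j in pairs]
    i, j = pairs[int(np.argmax(sims))]
    return ({0, 1, 2} - {i, j}).pop()


def context_weights(anchor, temperature=1.0):
    """Hypothetical weighting: softmax over the anchor's absolute activations.

    Dimensions on which the anchor is strongly active receive more weight.
    This is an illustrative assumption, not the authors' formulation.
    """
    a = np.abs(np.asarray(anchor, dtype=float)) / temperature
    a = a - a.max()  # subtract the max for numerical stability
    w = np.exp(a)
    return w / w.sum()


# Toy 2-D embeddings (made-up values, not real model outputs).
triplet = [[1.0, 0.2], [0.9, 0.8], [0.1, 0.9]]
anchor = [0.0, 3.0]  # hypothetical anchor that is active only on dimension 1

baseline_choice = odd_one_out(triplet)
context_choice = odd_one_out(triplet, weights=context_weights(anchor))
```

In this toy example the context-insensitive rule pairs items 0 and 1 and picks item 2 as the odd one out. Once the anchor emphasizes the second dimension, items 1 and 2 become the closest pair and item 0 is picked instead. The point is only that the same fixed embeddings can yield different similarity judgments under different contexts.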
