TL;DR
This paper introduces ImageNet-HSJ, a large dataset of human similarity judgments on the ILSVRC validation set, together with a psychological embedding trained on those judgments. Both can be used to evaluate how closely models match task-general human perception and reasoning, at a larger scale than previous datasets.
Contribution
The paper presents a new large-scale dataset and embedding space for human similarity judgments on ImageNet, with innovative sampling methods for scaling and evaluation of model alignment with human perception.
Findings
Complex models do not necessarily align better with human similarity judgments.
The new dataset and embeddings facilitate comprehensive evaluation of models against human perception.
The methodology improves scaling and quality of psychological embeddings.
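One way to make "alignment with human similarity judgments" concrete is an agreement score: the fraction of human judgments that a model's feature space reproduces. The sketch below is only illustrative and is not the paper's evaluation protocol. It assumes, hypothetically, that each judgment has been reduced to a triplet (anchor, closer, farther) of image indices, and that model similarity is the cosine similarity between feature vectors. The names `cosine`, `triplet_agreement`, `toy_features`, and `toy_triplets` are made up for this example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def triplet_agreement(features, triplets):
    """Fraction of (anchor, closer, farther) triplets the model agrees with.

    A triplet counts as a hit when the model rates the anchor as more
    similar to `closer` than to `farther`, matching the assumed human choice.
    """
    hits = 0
    for anchor, closer, farther in triplets:
        sim_closer = cosine(features[anchor], features[closer])
        sim_farther = cosine(features[anchor], features[farther])
        if sim_closer > sim_farther:
            hits += 1
    return hits / len(triplets)


# Toy data (hypothetical): four images with 2-D model features.
toy_features = [
    [1.0, 0.0],
    [0.9, 0.1],
    [0.0, 1.0],
    [0.1, 0.9],
]

# Hypothetical human judgments as (anchor, closer, farther) index triplets.
toy_triplets = [
    (0, 1, 2),  # model agrees: image 1 is nearer to image 0 than image 2 is
    (2, 3, 0),  # model agrees
    (1, 2, 0),  # model disagrees: it finds image 0 nearer to image 1
]

score = triplet_agreement(toy_features, toy_triplets)
print(f"agreement: {score:.3f}")
```

A score like this can be computed for any model that yields feature vectors, regardless of model size or training objective, so it allows the kind of cross-model comparison the findings describe.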
Abstract
Advances in object recognition flourished in part because of the availability of high-quality datasets and associated benchmarks. However, these benchmarks, such as ILSVRC, are relatively task-specific, focusing predominantly on predicting class labels. We introduce a publicly available dataset that embodies the task-general capabilities of human perception and reasoning. The Human Similarity Judgments extension to ImageNet (ImageNet-HSJ) is composed of human similarity judgments that supplement the ILSVRC validation set. The new dataset supports a range of task and performance metrics, including the evaluation of unsupervised learning algorithms. We demonstrate two methods of assessment: using the similarity judgments directly and using a psychological embedding trained on the similarity judgments. This embedding space contains an order of magnitude more points (i.e., images) than…
