Benchmarking Omni-Vision Representation through the Lens of Visual Realms

Yuanhan Zhang; Zhenfei Yin; Jing Shao; Ziwei Liu

arXiv:2207.07106 · cs.CV · July 18, 2022 · 1 cites

TL;DR

This paper introduces OmniBenchmark, a comprehensive dataset covering diverse visual realms, and proposes ReCo, a contrastive learning method that improves omni-vision representations by leveraging semantic relations.

Contribution

The paper presents a new benchmark dataset for evaluating omni-vision models and a novel contrastive learning framework that encodes semantic relations to enhance generalization.

Findings

1. ReCo outperforms other supervised contrastive methods.

2. OmniBenchmark covers most visual realms without semantic overlap.

3. ReCo improves omni-vision representations across architectures.

Abstract

Though impressive performance has been achieved in specific visual realms (e.g., faces, dogs, and places), an omni-vision representation that generalizes to many natural visual domains is highly desirable. However, existing benchmarks are biased and inefficient for evaluating omni-vision representations: they either include only a few specific realms, or cover most realms at the expense of subsuming numerous datasets with extensive realm overlap. In this paper, we propose the Omni-Realm Benchmark (OmniBenchmark). It includes 21 realm-wise datasets with 7,372 concepts and 1,074,346 images. With no semantic overlap, these datasets cover most visual realms both comprehensively and efficiently. In addition, we propose a new supervised contrastive learning framework, Relational Contrastive learning (ReCo), for a better omni-vision representation. Beyond pulling…
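
For background on the family of methods ReCo belongs to, the sketch below implements the generic supervised contrastive (SupCon-style) loss in plain Python. It is not ReCo: the relational component the paper adds is not reproduced here, and the function name `supcon_loss`, the temperature of 0.1, and the toy embeddings are illustrative assumptions rather than anything taken from the paper or its repository.

```python
import math


def supcon_loss(embeddings, labels, temperature=0.1):
    """Generic supervised contrastive loss (illustrative, not the paper's ReCo).

    embeddings: list of equal-length, non-zero float vectors.
    labels: list of class labels, one per embedding.
    """
    # L2-normalize so that dot products are cosine similarities.
    normed = []
    for vec in embeddings:
        norm = math.sqrt(sum(x * x for x in vec))
        normed.append([x / norm for x in vec])

    n = len(normed)
    total, anchors = 0.0, 0
    for i in range(n):
        # Positives: every other sample sharing the anchor's label.
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue  # an anchor without a same-label partner is skipped

        # Temperature-scaled similarity from anchor i to every other sample.
        logits = {
            j: sum(a * b for a, b in zip(normed[i], normed[j])) / temperature
            for j in range(n)
            if j != i
        }

        # Log of the softmax denominator, with max subtraction for stability.
        peak = max(logits.values())
        log_denom = peak + math.log(sum(math.exp(v - peak) for v in logits.values()))

        # Average negative log-probability of the positives for this anchor.
        total += -sum(logits[p] - log_denom for p in positives) / len(positives)
        anchors += 1

    return total / anchors if anchors else 0.0


# Toy data (hypothetical): two tight clusters in 2-D.
toy_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
aligned = supcon_loss(toy_embeddings, [0, 0, 1, 1])      # labels match the clusters
misaligned = supcon_loss(toy_embeddings, [0, 1, 0, 1])   # labels cut across the clusters
print(aligned < misaligned)
```

The loss is small when same-label samples sit close together and large when labels cut across the clusters, which is the "pulling" behaviour that supervised contrastive objectives share.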

Code & Models

Repositories

ZhangYuanhan-AI/OmniBenchmark (PyTorch, official)

Taxonomy

Topics: Advanced Image and Video Retrieval Techniques · Visual Attention and Saliency Detection · Video Surveillance and Tracking Methods

Methods: Contrastive Learning