Self-supervised Hypergraphs for Learning Multiple World Interpretations

Alina Marcu; Mihai Pirvu; Dragos Costea; Emanuela Haller; Emil Slusanschi; Ahmed Nabil Belbachir; Rahul Sukthankar; Marius Leordeanu

arXiv:2308.07615·cs.CV·August 22, 2023

TL;DR

This paper introduces a self-supervised hypergraph approach for learning multiple scene representations and improving pretrained vision transformers using minimal labeled data, demonstrating superior performance on a new UAV dataset.

Contribution

The paper proposes a multi-task hypergraph framework for scene interpretation, uses it to improve pretrained models without additional labels, and introduces the Dronescapes dataset.

Findings

01. Hypergraph-based multi-task learning outperforms existing models.

02. Ensemble methods improve robustness and pseudolabel quality.

03. The approach effectively leverages minimal labeled data for complex scene understanding.

Abstract

We present a method for learning multiple scene representations given a small labeled set, by exploiting the relationships between such representations in the form of a multi-task hypergraph. We also show how we can use the hypergraph to improve a powerful pretrained VisTransformer model without any additional labeled data. In our hypergraph, each node is an interpretation layer (e.g., depth or segmentation) of the scene. Within each hyperedge, one or several input nodes predict the layer at the output node. Thus, each node could be an input node in some hyperedges and an output node in others. In this way, multiple paths can reach the same node, to form ensembles from which we obtain robust pseudolabels, which allow self-supervised learning in the hypergraph. We test different ensemble models and different types of hyperedges and show superior performance to other multi-task graph…
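
The ensemble idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the `Hyperedge` class, the `ensemble_pseudolabel` function, the toy list-of-floats layers, and the hand-written predictors are all hypothetical, and the per-pixel median is only one assumed choice of ensemble (the abstract says several ensemble models are tested without naming them here). In a real system each hyperedge would presumably be a trained network rather than a lambda.

```python
from dataclasses import dataclass
from statistics import median
from typing import Callable, Dict, List, Sequence

# Toy stand-in for an interpretation layer: a flattened list of per-pixel values.
Layer = List[float]


@dataclass
class Hyperedge:
    """One or several input nodes predicting the layer at one output node."""
    inputs: Sequence[str]
    output: str
    predict: Callable[[List[Layer]], Layer]  # hypothetical predictor


def ensemble_pseudolabel(node: str,
                         layers: Dict[str, Layer],
                         edges: List[Hyperedge]) -> Layer:
    """Combine every hyperedge that reaches `node` into one pseudolabel.

    Assumption: a per-pixel median is used as the ensemble function.
    """
    candidates = [
        edge.predict([layers[name] for name in edge.inputs])
        for edge in edges
        if edge.output == node
    ]
    if not candidates:
        raise ValueError(f"no hyperedge predicts node {node!r}")
    # zip(*candidates) groups the candidate values pixel by pixel.
    return [median(values) for values in zip(*candidates)]


# Two available layers for one (three-pixel) toy scene.
layers = {
    "rgb": [0.25, 0.5, 0.75],
    "segmentation": [1.0, 0.0, 1.0],
}

# Three paths reaching the same output node "depth".
edges = [
    Hyperedge(["rgb"], "depth", lambda xs: [2 * v for v in xs[0]]),
    Hyperedge(["segmentation"], "depth", lambda xs: [v + 0.5 for v in xs[0]]),
    Hyperedge(["rgb", "segmentation"], "depth",
              lambda xs: [a + b for a, b in zip(xs[0], xs[1])]),
]

pseudo_depth = ensemble_pseudolabel("depth", layers, edges)
print(pseudo_depth)  # [1.25, 0.5, 1.5]
```

In this sketch the pseudolabel for `depth` would then serve as the training target for the hyperedges that output `depth`, which is how the abstract describes self-supervised learning within the hypergraph; a node such as `segmentation` is an input here but could equally be the output of other hyperedges.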

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques