Self-supervised Hypergraphs for Learning Multiple World Interpretations
Alina Marcu, Mihai Pirvu, Dragos Costea, Emanuela Haller, Emil Slusanschi, Ahmed Nabil Belbachir, Rahul Sukthankar, Marius Leordeanu

TL;DR
This paper introduces a self-supervised hypergraph approach for learning multiple scene representations and improving pretrained vision transformers using minimal labeled data, demonstrating superior performance on a new UAV dataset.
Contribution
The paper proposes a novel multi-task hypergraph framework for scene interpretation and enhances pretrained models without additional labels, introducing the Dronescapes dataset.
Findings
Hypergraph-based multi-task learning outperforms other multi-task graph models.
Ensemble methods improve robustness and pseudolabel quality.
The approach effectively leverages minimal labeled data for complex scene understanding.
Abstract
We present a method for learning multiple scene representations given a small labeled set, by exploiting the relationships between such representations in the form of a multi-task hypergraph. We also show how we can use the hypergraph to improve a powerful pretrained VisTransformer model without any additional labeled data. In our hypergraph, each node is an interpretation layer (e.g., depth or segmentation) of the scene. Within each hyperedge, one or several input nodes predict the layer at the output node. Thus, each node could be an input node in some hyperedges and an output node in others. In this way, multiple paths can reach the same node, to form ensembles from which we obtain robust pseudolabels, which allow self-supervised learning in the hypergraph. We test different ensemble models and different types of hyperedges and show superior performance to other multi-task graph…
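The ensemble step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `Hyperedge` and `ensemble_pseudolabel` are hypothetical, the "layers" are tiny flattened lists rather than images, the hyperedge predictors are toy functions standing in for trained networks, and the per-pixel median is only one assumed choice of ensemble (the paper reports testing several).

```python
from dataclasses import dataclass
from statistics import median
from typing import Callable, Dict, List, Sequence, Tuple

Layer = List[float]  # toy stand-in for one interpretation layer (e.g., depth)


@dataclass(frozen=True)
class Hyperedge:
    """One or several input nodes predicting a single output node."""
    inputs: Tuple[str, ...]
    output: str
    predict: Callable[[Sequence[Layer]], Layer]  # assumed: a trained model


def ensemble_pseudolabel(
    edges: Sequence[Hyperedge], layers: Dict[str, Layer], target: str
) -> Layer:
    """Combine every hyperedge that reaches `target` into one pseudolabel."""
    candidates = [
        edge.predict([layers[name] for name in edge.inputs])
        for edge in edges
        if edge.output == target and all(name in layers for name in edge.inputs)
    ]
    if not candidates:
        raise ValueError(f"no hyperedge can reach node {target!r}")
    # Assumed ensemble: per-pixel median, which tolerates one outlier path.
    return [median(values) for values in zip(*candidates)]


# Toy scene with two available layers and three paths into the "depth" node.
layers = {"rgb": [1.0, 2.0, 3.0], "segmentation": [0.0, 1.0, 0.0]}
edges = [
    Hyperedge(("rgb",), "depth", lambda x: [2 * v for v in x[0]]),
    Hyperedge(
        ("rgb", "segmentation"),
        "depth",
        lambda x: [r + s + 1 for r, s in zip(x[0], x[1])],
    ),
    Hyperedge(("segmentation",), "depth", lambda x: [10 * v for v in x[0]]),
]

pseudo_depth = ensemble_pseudolabel(edges, layers, "depth")
```

In a self-supervised loop of the kind the abstract outlines, a pseudolabel such as `pseudo_depth` would serve as the training target for the hyperedges that output `depth` on unlabeled data. The same node can also appear as an input in other hyperedges, as `segmentation` does here.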
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques
