UberNet: Training a 'Universal' Convolutional Neural Network for Low-, Mid-, and High-Level Vision using Diverse Datasets and Limited Memory

Iasonas Kokkinos

arXiv:1609.02132·cs.CV·September 8, 2016

TL;DR

UberNet is a unified CNN architecture that performs multiple low-, mid-, and high-level vision tasks simultaneously. It is trained on diverse datasets under a limited memory budget and achieves competitive results at roughly 0.7 seconds per frame.

Contribution

This work introduces UberNet, an end-to-end trainable CNN that handles a wide range of vision tasks within a single model. It addresses two training challenges: relying on diverse training sets and fitting many tasks into a limited memory budget.

Findings

01

Handles multiple vision tasks simultaneously

02

Processes all tasks in about 0.7 seconds per frame

03

Maintains competitive accuracy across tasks

Abstract

In this work we introduce a convolutional neural network (CNN) that jointly handles low-, mid-, and high-level vision tasks in a unified architecture that is trained end-to-end. Such a universal network can act like a 'Swiss knife' for vision tasks; we call this architecture an UberNet to indicate its overarching nature. We address two main technical challenges that emerge when broadening the range of tasks handled by a single CNN: (i) training a deep architecture while relying on diverse training sets and (ii) training many (potentially unlimited) tasks with a limited memory budget. Properly addressing these two problems allows us to train accurate predictors for a host of tasks, without compromising accuracy. Through these advances we train in an end-to-end manner a CNN that simultaneously addresses (a) boundary detection (b) normal estimation (c) saliency estimation (d)…
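
Challenge (i) can be illustrated with a small sketch. The abstract does not describe the mechanism used, so the following is an assumption rather than the paper's method: one common way to train on datasets that each annotate only some tasks is to accumulate each task's loss only over the samples that carry a label for it. The task names, the squared-error loss, and the function `masked_multitask_loss` are all hypothetical and chosen for illustration.

```python
from typing import Dict, List, Optional

# Hypothetical task names, for illustration only (not taken from any codebase).
TASKS = ["boundary", "normals", "saliency"]


def squared_error(pred: float, target: float) -> float:
    """Toy per-sample loss; a real system would use a task-specific loss."""
    return (pred - target) ** 2


def masked_multitask_loss(
    preds: List[Dict[str, float]],
    labels: List[Dict[str, Optional[float]]],
) -> Dict[str, float]:
    """Average each task's loss over only the samples that annotate that task."""
    totals = {task: 0.0 for task in TASKS}
    counts = {task: 0 for task in TASKS}
    for pred, label in zip(preds, labels):
        for task in TASKS:
            target = label.get(task)
            if target is None:
                # Assumption: a missing label means this sample's dataset
                # has no ground truth for the task, so it is skipped.
                continue
            totals[task] += squared_error(pred[task], target)
            counts[task] += 1
    # A task with no labelled sample in the batch contributes zero loss.
    return {
        task: (totals[task] / counts[task] if counts[task] else 0.0)
        for task in TASKS
    }


# Two samples from two hypothetical datasets with different annotations.
preds = [
    {"boundary": 0.5, "normals": 1.0, "saliency": 0.0},
    {"boundary": 1.0, "normals": 0.0, "saliency": 2.0},
]
labels = [
    {"boundary": 1.0},                   # dataset A: boundaries only
    {"boundary": 1.0, "saliency": 0.0},  # dataset B: boundaries and saliency
]
losses = masked_multitask_loss(preds, labels)
```

In this toy batch no sample annotates `normals`, so that task contributes nothing, while `boundary` is averaged over both samples and `saliency` over one.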

Code & Models

Repositories

EPFL-VILAB/XDEnsembles
pytorch
