Rethinking Encoder-Decoder Flow Through Shared Structures

Frederik Laboyrie; Mehmet Kerim Yucel; Albert Saa-Garriga

arXiv:2501.14535·cs.CV·January 27, 2025

Open Access

TL;DR

This paper proposes banks, shared structures that give every decoder block additional context in dense prediction tasks, improving depth estimation for transformer-based architectures on natural and synthetic images.

Contribution

The paper introduces banks, shared structures used by each decoding block, which provide additional context and improve performance in depth estimation tasks.

Findings

1. Banks improve depth estimation accuracy.
2. Shared structures enhance decoder context understanding.
3. Performance gains are observed on natural and synthetic images.

Abstract

Dense prediction tasks have enjoyed a growing complexity of encoder architectures; decoders, however, have remained largely the same. They rely on individual blocks decoding intermediate feature maps sequentially. We introduce banks, shared structures that are used by each decoding block to provide additional context in the decoding process. These structures, applied via resampling and feature fusion, improve performance on depth estimation for state-of-the-art transformer-based architectures on natural and synthetic images whilst training on large-scale datasets.
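
To make the idea of a shared structure concrete, the following is a minimal NumPy sketch of how a bank might be resampled and fused into each decoding block. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say how a bank is built, so building it by stacking resampled encoder maps, using nearest-neighbour resampling, and fusing with a 1x1 projection followed by addition are all assumptions, and the names `resize_nearest`, `build_bank`, and `fuse_with_bank` are hypothetical.

```python
import numpy as np


def resize_nearest(x, out_hw):
    """Nearest-neighbour resample of a (C, H, W) array to (C, out_h, out_w)."""
    _, h, w = x.shape
    out_h, out_w = out_hw
    rows = np.arange(out_h) * h // out_h  # source row for each output row
    cols = np.arange(out_w) * w // out_w  # source column for each output column
    return x[:, rows[:, None], cols[None, :]]


def build_bank(features, bank_hw):
    """Assumed bank construction: resample every map to one size, stack channels."""
    return np.concatenate([resize_nearest(f, bank_hw) for f in features], axis=0)


def fuse_with_bank(feature, bank, proj):
    """Resample the shared bank to this block's resolution and fuse it in.

    Fusion here is an assumed 1x1 projection (proj: C_out x C_bank) plus addition.
    """
    bank_r = resize_nearest(bank, feature.shape[1:])
    context = np.einsum("oc,chw->ohw", proj, bank_r)
    return feature + context


rng = np.random.default_rng(0)

# Three hypothetical intermediate feature maps at decreasing resolution.
features = [
    rng.standard_normal((8, 32, 32)),
    rng.standard_normal((8, 16, 16)),
    rng.standard_normal((8, 8, 8)),
]

# One bank, built once and shared by every decoding block.
bank = build_bank(features, (16, 16))

outputs = []
for f in features:
    # Random weights stand in for a learned per-block projection.
    proj = rng.standard_normal((f.shape[0], bank.shape[0])) * 0.1
    outputs.append(fuse_with_bank(f, bank, proj))
```

The point of the sketch is the data flow: the bank is computed once, and each block only pays for a resample and a fusion step, so every block sees context from all levels rather than only the map it is decoding.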
