Rethinking Encoder-Decoder Flow Through Shared Structures
Frederik Laboyrie, Mehmet Kerim Yucel, Albert Saa-Garriga

TL;DR
This paper proposes banks, shared structures that every decoding block draws on for additional context in dense prediction tasks. Banks improve depth estimation for transformer-based architectures on natural and synthetic images.
Contribution
It introduces banks as shared structures for decoders, providing additional context and improving performance in depth estimation tasks.
Findings
Banks improve depth estimation accuracy.
Shared structures give each decoding block additional context during decoding.
Performance gains observed on natural and synthetic images.
Abstract
Dense prediction tasks have enjoyed a growing complexity of encoder architectures; decoders, however, have remained largely the same. They rely on individual blocks decoding intermediate feature maps sequentially. We introduce banks, shared structures that are used by each decoding block to provide additional context in the decoding process. Applied via resampling and feature fusion, these structures improve performance on depth estimation for state-of-the-art transformer-based architectures on natural and synthetic images whilst training on large-scale datasets.
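The abstract describes banks as shared structures that each decoding block reads through resampling and feature fusion, but it does not say how a bank is built or which fusion operator is used. The following is a minimal sketch of that data flow only, not the authors' implementation. It assumes a single-channel 2D bank, nearest-neighbour resampling, and additive fusion. The names resample_nearest, fuse_add, and decode_with_bank are hypothetical.

```python
# Illustrative sketch only: the abstract does not specify how banks are built,
# resampled, or fused. Nearest-neighbour resampling and additive fusion are
# assumptions chosen for brevity; all function names are hypothetical.


def resample_nearest(grid, out_h, out_w):
    """Resize a 2D grid (list of rows) to out_h x out_w by nearest neighbour."""
    in_h, in_w = len(grid), len(grid[0])
    return [
        [grid[(i * in_h) // out_h][(j * in_w) // out_w] for j in range(out_w)]
        for i in range(out_h)
    ]


def fuse_add(features, context):
    """Fuse two same-sized 2D grids by element-wise addition (assumed operator)."""
    return [
        [f + c for f, c in zip(f_row, c_row)]
        for f_row, c_row in zip(features, context)
    ]


def decode_with_bank(stage_features, bank):
    """Let every decoding stage read the same shared bank at its own resolution."""
    outputs = []
    for feats in stage_features:
        h, w = len(feats), len(feats[0])
        context = resample_nearest(bank, h, w)  # match the stage's resolution
        outputs.append(fuse_add(feats, context))
    return outputs


# Hypothetical example: one 2x2 bank shared by a coarse and a fine decoder stage.
bank = [[1.0, 2.0], [3.0, 4.0]]
stage_features = [
    [[0.0]],                         # coarse stage, 1x1
    [[0.0] * 4 for _ in range(4)],   # fine stage, 4x4
]
fused = decode_with_bank(stage_features, bank)
```

The point of the sketch is that the bank is created once and reused by every stage, in contrast to a purely sequential decoder in which each block sees only the feature map passed to it. A real model would operate on multi-channel tensors and would likely use learned fusion.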
Taxonomy
Topics: Cinema History and Criticism
