# Lightweight Monocular Depth Estimation Model by Joint End-to-End Filter Pruning

**Authors:** Sara Elkerdawy, Hong Zhang, Nilanjan Ray

arXiv: 1905.05212 · 2020-05-19

## TL;DR

This paper introduces a joint end-to-end filter pruning method for monocular depth estimation models. It compresses a large trained model by around 5x with a small drop in accuracy on KITTI, targeting deployment on low-end devices.

## Contribution

It presents a new filter pruning approach that learns binary masks for filters, exploiting inter-filter relations to create lightweight models from large pre-trained networks.

## Key findings

- Achieves around 5x compression with a small drop in accuracy on the KITTI driving dataset.
- Masking can improve accuracy over the baseline with fewer parameters, even without enforcing a compression loss.
- Method enables deployment on resource-constrained devices.
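
To give a feel for what a compression rate means for filter pruning, here is a minimal sketch that counts convolution weights before and after removing filters. The layer sizes, the kernel size, and the helper names `conv_params` and `compression_rate` are illustrative assumptions, not values or code from the paper.

```python
def conv_params(c_in, c_out, k=3):
    """Weight count of a k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def compression_rate(channels, kept, k=3):
    """Ratio of original to pruned weight count for a chain of convolutions.

    channels: filter counts per layer, starting with the input channels.
    kept:     filters kept per layer after pruning (input channels unchanged).
    """
    original = sum(conv_params(a, b, k) for a, b in zip(channels, channels[1:]))
    pruned = sum(conv_params(a, b, k) for a, b in zip(kept, kept[1:]))
    return original / pruned


# Hypothetical three-layer network on RGB input; the numbers are made up.
channels = [3, 64, 128, 256]
kept = [3, 32, 64, 128]  # half of the filters kept in every layer
rate = compression_rate(channels, kept)
print(f"compression rate: {rate:.2f}x")
```

Removing a filter shrinks both its own layer and the input side of the next layer, so keeping half the filters everywhere gives close to a 4x reduction in this toy setup, not 2x.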

## Abstract

Convolutional neural networks (CNNs) have emerged as the state-of-the-art in multiple vision tasks including depth estimation. However, memory and computing power requirements remain challenges to be tackled in these models. Monocular depth estimation has significant use in robotics and virtual reality, which require deployment on low-end devices. Training a small model from scratch results in a significant drop in accuracy and does not benefit from pre-trained large models. Motivated by the literature on model pruning, we propose a lightweight monocular depth model obtained from a large trained model. This is achieved by removing the least important features with a novel joint end-to-end filter pruning. We propose to learn a binary mask for each filter to decide whether to drop the filter or not. These masks are trained jointly to exploit relations between filters at different layers as well as redundancy within the same layer. We show that we can achieve a compression rate of around 5x with a small drop in accuracy on the KITTI driving dataset. We also show that masking can improve accuracy over the baseline with fewer parameters, even without enforcing a compression loss.
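
The abstract does not say how the binary masks are parameterized or optimized, so the following is only a rough sketch of one common way to learn a per-filter mask. It assumes a real-valued score per filter, a hard threshold in the forward pass, a straight-through gradient, and an L1-style penalty standing in for the compression loss. All names (`binarize`, `mask_filters`, `straight_through_step`), thresholds, and gradient values are hypothetical.

```python
def binarize(scores, threshold=0.5):
    """Hard 0/1 mask from real-valued per-filter scores (assumed parameterization)."""
    return [1.0 if s > threshold else 0.0 for s in scores]


def mask_filters(feature_maps, mask):
    """Zero the output channel of every filter whose mask is 0."""
    return [[m * v for v in channel] for channel, m in zip(feature_maps, mask)]


def straight_through_step(scores, task_grads, lam=0.1, lr=0.5):
    """One gradient step on the scores.

    The threshold has zero gradient almost everywhere, so the gradient with
    respect to the binary mask is passed to the scores unchanged. `lam` weights
    a penalty that pushes every score down, standing in for a compression loss.
    """
    return [s - lr * (g + lam) for s, g in zip(scores, task_grads)]


# Joint training: a single update touches the scores of every layer at once.
scores = {"conv1": [0.9, 0.52, 0.7], "conv2": [0.6, 0.51]}
task_grads = {"conv1": [-0.2, 0.0, -0.1], "conv2": [-0.3, 0.0]}  # made-up values
scores = {name: straight_through_step(s, task_grads[name]) for name, s in scores.items()}
masks = {name: binarize(s) for name, s in scores.items()}

# Apply the conv1 mask to three toy output channels.
features = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
pruned = mask_filters(features, masks["conv1"])
```

In this toy run the filters that the task gradient does not defend drift below the threshold and are masked out, while the others survive. Once training ends, filters with a zero mask can be removed from the network outright, which is where the parameter savings come from.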

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1905.05212/full.md

## Figures

15 figures with captions in the complete paper: https://tomesphere.com/paper/1905.05212/full.md

## References

23 references — full list in the complete paper: https://tomesphere.com/paper/1905.05212/full.md

---
Source: https://tomesphere.com/paper/1905.05212