# TW-SMNet: Deep Multitask Learning of Tele-Wide Stereo Matching

**Authors:** Mostafa El-Khamy, Haoyu Ren, Xianzhi Du, and Jungwon Lee

arXiv: 1906.04463 · 2020-10-20

## TL;DR

This paper presents a deep multitask neural network that estimates real-world depth from a stereo pair captured by a wide-angle camera and a telephoto camera. Depth is estimated over the full wide field of view, not only the region where the two views overlap.

## Contribution

Introduces the TW-SM problem and proposes a novel multitask deep learning model that jointly estimates disparity and inverse depth for tele-wide stereo images.

## Key findings

- Evaluates the proposed TW-SM solutions on the KITTI and SceneFlow stereo datasets.
- Demonstrates practicality by synthesizing the Bokeh effect on the wide FOV from a tele-wide stereo image pair.
- Shows improved depth accuracy over baseline methods.

## Abstract

In this paper, we introduce the problem of estimating the real-world depth of elements in a scene captured by two cameras with different fields of view, where the first field of view (FOV) is a Wide FOV (WFOV) captured by a wide-angle lens, and the second FOV is contained in the first FOV and is captured by a tele zoom lens. We refer to the problem of estimating the inverse depth for the union of FOVs, while leveraging the stereo information in the overlapping FOV, as Tele-Wide Stereo Matching (TW-SM). We propose different deep learning solutions to the TW-SM problem. Since the disparity is proportional to the inverse depth, we train stereo matching disparity estimation (SMDE) networks to estimate the disparity for the union WFOV. We further propose an end-to-end deep multitask tele-wide stereo matching neural network (MT-TW-SMNet), which simultaneously learns the SMDE task for the overlapped Tele FOV and the single image inverse depth estimation (SIDE) task for the WFOV. Moreover, we design multiple methods for the fusion of the SMDE and SIDE networks. We evaluate the performance of TW-SM on the popular KITTI and SceneFlow stereo datasets, and demonstrate its practicality by synthesizing the Bokeh effect on the WFOV from a tele-wide stereo image pair.
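The abstract relies on two ideas that a short sketch may make concrete: disparity is proportional to inverse depth, and the tele FOV is contained in the wide FOV. The Python below is an illustrative sketch only, not the paper's implementation. It assumes a rectified stereo pair (so depth equals focal length times baseline divided by disparity), a tele FOV centered in the wide frame, and a known zoom factor. The function names, the `zoom` parameter, and the copy-paste fusion are hypothetical; the paper's own fusion methods are not described in this summary.

```python
import numpy as np


def disparity_to_depth(disparity, focal_px, baseline_m, eps=1e-6):
    """Convert disparity (pixels) to depth (meters) for a rectified pair.

    Uses depth = focal_px * baseline_m / disparity, so disparity is
    proportional to inverse depth. Pixels with near-zero disparity are
    treated as infinitely far away.
    """
    disparity = np.asarray(disparity, dtype=np.float64)
    depth = np.full(disparity.shape, np.inf)
    valid = disparity > eps
    depth[valid] = focal_px * baseline_m / disparity[valid]
    return depth


def tele_region(height, width, zoom=2.0):
    """Row and column slices of the tele FOV inside the wide frame.

    Assumption: the tele FOV is centered and covers 1/zoom of the wide
    frame along each axis.
    """
    h = int(round(height / zoom))
    w = int(round(width / zoom))
    top = (height - h) // 2
    left = (width - w) // 2
    return slice(top, top + h), slice(left, left + w)


def naive_fuse(side_wide, smde_tele, zoom=2.0):
    """Hypothetical fusion: keep the single-image estimate on the wide
    periphery and overwrite the overlapped region with the stereo estimate.

    `smde_tele` must already be resampled to the size of the tele region.
    """
    rows, cols = tele_region(side_wide.shape[0], side_wide.shape[1], zoom)
    fused = side_wide.copy()
    fused[rows, cols] = smde_tele
    return fused


if __name__ == "__main__":
    # Illustrative camera values, not taken from the paper.
    depth = disparity_to_depth(np.array([38.88, 0.0]), focal_px=720.0, baseline_m=0.54)
    print(depth)

    wide = np.zeros((8, 12))
    tele = np.ones((4, 6))
    print(naive_fuse(wide, tele, zoom=2.0))
```

A hard copy-paste like `naive_fuse` would leave a visible seam at the tele boundary, which is one reason a learned or blended fusion is likely preferable in practice.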

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1906.04463/full.md

## Figures

19 figures with captions in the complete paper: https://tomesphere.com/paper/1906.04463/full.md

## References

64 references — full list in the complete paper: https://tomesphere.com/paper/1906.04463/full.md

---
Source: https://tomesphere.com/paper/1906.04463