See far with TPNET: a Tile Processor and a CNN Symbiosis
Andrey Filippov, Oleg Dzhimiev

TL;DR
TPNET is a novel neural network architecture inspired by human visual perception, combining a Tile Processor with CNNs to enhance 3D perception accuracy and efficiency in high-resolution stereo imaging.
Contribution
Introducing TPNET, a hybrid system that offloads image correction to a Tile Processor, enabling more accurate disparity prediction with reduced network complexity.
Findings
Disparity prediction from TPNET is twice as accurate as that of hand-crafted algorithms.
The Tile Processor reduces input feature dimensions and provides invariant data for real-time high-resolution stereo perception.
TPNET enables efficient 3D perception with less dependence on end-to-end training.
Abstract
Throughout the evolution of neural networks, more specialized cells were added to the set of basic building blocks. These cells aim to improve training convergence, increase overall performance, and reduce the number of required labels, all while preserving the expressive power of the universal network. Inspired by the partitioning of the human visual perception system between the eyes and the cerebral cortex, we present TPNET, which offloads the bulk processing of the high-resolution pixel data from the universal and application-specific CNN and performs the translation-variant image correction while delegating all non-linear decision making to the network. In this work, we explore the application of TPNET to 3D perception with a narrow-baseline (0.0001-0.0025) quad stereo camera and prove that a trained network provides a disparity prediction from the 2D phase correlation output by…
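For readers unfamiliar with the 2D phase correlation mentioned in the abstract, the following is a minimal sketch of the generic FFT-based technique applied to one pair of image tiles. It is not the authors' Tile Processor implementation. The function names (`phase_correlation`, `peak_offset`), the 16x16 tile size, and the synthetic test data are assumptions made purely for illustration.

```python
import numpy as np

def phase_correlation(tile_a, tile_b, eps=1e-9):
    """Return the centered 2D phase correlation surface of two equal-size tiles.

    Generic textbook formulation (hypothetical helper, not the paper's code).
    """
    fa = np.fft.fft2(tile_a)
    fb = np.fft.fft2(tile_b)
    cross = fa * np.conj(fb)
    cross /= np.abs(cross) + eps          # discard magnitude, keep phase only
    corr = np.fft.ifft2(cross).real
    return np.fft.fftshift(corr)          # move zero shift to the tile center

def peak_offset(corr):
    """Integer (dy, dx) position of the correlation peak relative to the center."""
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    center = np.array(corr.shape) // 2
    return tuple(int(p - c) for p, c in zip(peak, center))

# Synthetic example: the second tile is the first one shifted 3 pixels horizontally.
rng = np.random.default_rng(0)
base = rng.standard_normal((16, 16))
shifted = np.roll(base, shift=(0, 3), axis=(0, 1))

corr = phase_correlation(shifted, base)
offset = peak_offset(corr)
print(offset)  # (0, 3)
```

In this sketch the disparity is read off as the integer position of the correlation peak. According to the abstract, TPNET instead passes the 2D correlation output to a trained network, which makes the disparity decision.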
Taxonomy
Topics: Advanced Vision and Imaging · CCD and CMOS Imaging Sensors · Advanced Image and Video Retrieval Techniques
