Investigations on the inference optimization techniques and their impact on multiple hardware platforms for Semantic Segmentation

Sethu Hareesh Kolluru

arXiv:1911.12993·cs.CV·December 2, 2019

TL;DR

This paper evaluates various inference optimization techniques for semantic segmentation models across different hardware platforms, focusing on reducing inference time and energy consumption in self-driving applications.

Contribution

It provides a comprehensive analysis of optimization techniques and their effects on inference speed and energy efficiency on multiple hardware platforms.

Findings

01. TensorFlow and TensorRT optimization techniques significantly reduce inference time.

02. Porting networks to embedded platforms like Nvidia Jetson TX1 impacts inference speed and energy use.

03. Optimization effects vary across hardware platforms, influencing deployment choices.
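
The inference-time and energy comparison behind these findings can be sketched roughly as follows. This is a minimal illustration, not the paper's measurement code: `measure_latency` and `energy_joules` are hypothetical helper names, the `infer` callable stands in for a real network forward pass, and the power and latency figures in the example are placeholders rather than results from the paper. The sketch assumes energy per inference is approximated as average power draw multiplied by latency.

```python
import statistics
import time


def measure_latency(infer, n_warmup=3, n_runs=10):
    """Return the median wall-clock time in seconds of one call to infer().

    Warm-up calls are discarded so one-off costs (lazy initialisation,
    caching) do not skew the timed runs.
    """
    for _ in range(n_warmup):
        infer()
    samples = []
    for _ in range(n_runs):
        start = time.perf_counter()
        infer()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def energy_joules(avg_power_watts, latency_seconds):
    """Approximate energy per inference as average power times latency."""
    return avg_power_watts * latency_seconds


if __name__ == "__main__":
    # Stand-in workload; a real test would call the segmentation network.
    latency = measure_latency(lambda: sum(range(10_000)))
    print(f"median latency: {latency:.6f} s")

    # Placeholder figures (NOT from the paper): a high-power device that is
    # fast versus a low-power device that is slow.
    fast_device = energy_joules(avg_power_watts=200.0, latency_seconds=0.05)
    slow_device = energy_joules(avg_power_watts=10.0, latency_seconds=0.5)
    print(f"fast device: {fast_device:.2f} J, slow device: {slow_device:.2f} J")
```

With placeholder numbers like these, the slower device can still use less energy per frame, which is one reason latency and energy may point to different deployment choices.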

Abstract

This work explores pixel-wise semantic segmentation in the context of self-driving, with the goal of reducing inference time. Fully Convolutional Networks (FCN-8s, FCN-16s, and FCN-32s) with a VGG16 encoder and skip connections are trained and validated on the Cityscapes dataset. Several inference optimization techniques built into TensorFlow and TensorRT are investigated numerically to quantify their impact on inference time and network size. Finally, the trained network is ported to an embedded platform (Nvidia Jetson TX1), and the inference time and the total energy consumed for inference are compared across hardware platforms.
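
As a rough illustration of how the three FCN variants named in the abstract differ, the sketch below follows the standard FCN design: the coarsest score map sits at 1/32 of the input resolution, and FCN-16s and FCN-8s fuse it with score maps taken at 1/16 and 1/8 resolution before the final upsampling. Several things here are assumptions made for illustration: nearest-neighbour upsampling replaces the learned transposed convolutions, each score map is a single-channel 2D list, and the names `upsample`, `add`, and `fcn_head` are hypothetical, not taken from the paper's code.

```python
def upsample(grid, factor):
    """Nearest-neighbour upsampling of a 2D list by an integer factor.

    Stand-in for the learned transposed convolution used in a real FCN.
    """
    out = []
    for row in grid:
        wide = [v for v in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def add(a, b):
    """Element-wise sum of two equally sized 2D lists (the skip fusion)."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def fcn_head(score32, score16=None, score8=None):
    """Combine score maps into a full-resolution prediction.

    score32 only             -> FCN-32s: upsample by 32
    score32 + score16        -> FCN-16s: fuse at 1/16, upsample by 16
    score32 + score16 + score8 -> FCN-8s: fuse at 1/16 and 1/8, upsample by 8
    """
    if score16 is None:
        if score8 is not None:
            raise ValueError("score8 requires score16 as well")
        return upsample(score32, 32)
    fused16 = add(upsample(score32, 2), score16)
    if score8 is None:
        return upsample(fused16, 16)
    fused8 = add(upsample(fused16, 2), score8)
    return upsample(fused8, 8)


if __name__ == "__main__":
    # Hypothetical 64x64 input: score maps are 2x2, 4x4, and 8x8.
    s32 = [[1, 2], [3, 4]]
    s16 = [[0] * 4 for _ in range(4)]
    s8 = [[0] * 8 for _ in range(8)]
    for name, out in [
        ("FCN-32s", fcn_head(s32)),
        ("FCN-16s", fcn_head(s32, s16)),
        ("FCN-8s", fcn_head(s32, s16, s8)),
    ]:
        print(name, len(out), "x", len(out[0]))
```

All three variants return a map at the input resolution; they differ in how much finer-grained information is fused in before the last upsampling step.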

Taxonomy

Topics: Advanced Neural Network Applications · Adversarial Robustness in Machine Learning · CCD and CMOS Imaging Sensors