Investigations on the inference optimization techniques and their impact on multiple hardware platforms for Semantic Segmentation
Sethu Hareesh Kolluru

TL;DR
This paper evaluates inference optimization techniques built into TensorFlow and TensorRT for FCN-based semantic segmentation models in self-driving applications, and compares inference time and energy consumption across hardware platforms.
Contribution
It quantifies the effect of TensorFlow and TensorRT inference optimizations on inference time and network size, and compares inference time and total energy consumption across hardware platforms.
Findings
TensorFlow and TensorRT optimization techniques significantly reduce inference time.
The trained network is ported to an embedded platform (Nvidia Jetson TX1), where inference time and total energy consumed are compared against other hardware platforms.
Optimization effects vary across hardware platforms, influencing deployment choices.
Abstract
This work explores pixel-wise semantic segmentation in the context of self-driving, with the goal of reducing inference time. Fully Convolutional Networks (FCN-8s, FCN-16s, and FCN-32s) with a VGG16 encoder architecture and skip connections are trained and validated on the Cityscapes dataset. Numerical investigations are carried out for several inference optimization techniques built into TensorFlow and TensorRT to quantify their impact on inference time and network size. Finally, the trained network is ported to an embedded platform (Nvidia Jetson TX1), and the inference time and the total energy consumed for inference are compared across hardware platforms.
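The comparison described in the abstract rests on two measurements: per-inference latency and the energy consumed per inference. The sketch below shows one way such measurements could be taken. It is not the paper's measurement code: the helper names `measure_latency` and `energy_joules` are hypothetical, the model is replaced by a stand-in callable, and the power figure in the demo is a placeholder, not a value reported in the paper. Energy is approximated as average power multiplied by latency, which assumes power draw is roughly constant during inference.

```python
import statistics
import time
from typing import Callable, Dict


def measure_latency(
    run_inference: Callable[[], object],
    warmup: int = 3,
    repeats: int = 20,
) -> Dict[str, float]:
    """Time a zero-argument inference callable; results are in seconds."""
    # Warm-up runs are discarded so one-time costs (lazy initialization,
    # cache population) do not distort the timed samples.
    for _ in range(warmup):
        run_inference()

    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_inference()
        samples.append(time.perf_counter() - start)

    return {
        "mean_s": statistics.mean(samples),
        "median_s": statistics.median(samples),
        "min_s": min(samples),
    }


def energy_joules(avg_power_watts: float, latency_s: float) -> float:
    """Approximate energy per inference as average power times latency."""
    return avg_power_watts * latency_s


if __name__ == "__main__":
    # Stand-in workload; a real run would call the segmentation model here.
    stats = measure_latency(lambda: sum(range(100_000)))
    # 10.0 W is a placeholder; a real figure would come from a power meter
    # or the platform's own power monitoring.
    print(stats, energy_joules(10.0, stats["mean_s"]))
```

Measuring energy rather than latency alone matters for the cross-platform comparison because a slower embedded board can still consume less total energy per inference if its power draw is low enough.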
Taxonomy
Topics: Advanced Neural Network Applications · Adversarial Robustness in Machine Learning · CCD and CMOS Imaging Sensors
