TL;DR
LinkNet introduces an efficient neural network architecture for semantic segmentation that achieves high accuracy with significantly fewer parameters and computational resources, enabling real-time scene understanding.
Contribution
The paper presents a novel deep neural network architecture that maintains high performance while reducing parameter count and computational complexity for semantic segmentation.
Findings
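Uses only 11.5 million parameters and 21.2 GFLOPs to process an image of resolution 3x640x360
Achieves state-of-the-art performance on the CamVid dataset and comparable results on Cityscapes
Compares processing time on an NVIDIA GPU and an embedded system device against existing state-of-the-art architectures at different image resolutions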
Abstract
Pixel-wise semantic segmentation for visual scene understanding needs to be not only accurate but also efficient in order to be useful in real-time applications. Existing algorithms, although accurate, do not focus on using the parameters of the neural network efficiently. As a result, they are large in terms of parameters and number of operations, and hence slow. In this paper, we propose a novel deep neural network architecture that allows the network to learn without any significant increase in the number of parameters. Our network uses only 11.5 million parameters and 21.2 GFLOPs to process an image of resolution 3x640x360. It gives state-of-the-art performance on CamVid and comparable results on the Cityscapes dataset. We also compare our network's processing time on an NVIDIA GPU and an embedded system device with that of existing state-of-the-art architectures at different image resolutions.
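To make the parameter and operation counts quoted in the abstract concrete, the sketch below shows how such figures are typically tallied for a single convolution layer. It is not LinkNet's layer list: the function name `conv2d_cost`, the example layer (a 7x7 convolution with 64 filters, stride 2, padding 3), and the reading of 3x640x360 as 3 channels, width 640, height 360 are all assumptions made for illustration. Conventions also differ on whether one multiply-accumulate counts as one or two floating-point operations, so the sketch reports multiply-accumulates only.

```python
def conv2d_cost(in_ch, out_ch, kernel, in_h, in_w, stride=1, padding=0, bias=True):
    """Return (parameters, multiply-accumulates) for one square-kernel 2D convolution.

    Hypothetical helper for illustration; it is not taken from the paper.
    """
    # Standard output-size formula for a convolution without dilation.
    out_h = (in_h + 2 * padding - kernel) // stride + 1
    out_w = (in_w + 2 * padding - kernel) // stride + 1

    # One kernel of size in_ch x kernel x kernel per output channel.
    weights = out_ch * in_ch * kernel * kernel
    params = weights + (out_ch if bias else 0)

    # Every output position applies all weights once (bias additions ignored).
    macs = weights * out_h * out_w
    return params, macs


# Assumed example layer: 3 -> 64 channels, 7x7 kernel, stride 2, padding 3,
# applied to an input of height 360 and width 640.
params, macs = conv2d_cost(3, 64, 7, 360, 640, stride=2, padding=3, bias=False)
print(params)  # 9408
print(macs)    # 541900800
```

Summing these two quantities over every layer of a network gives totals of the kind reported in the abstract.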
