TL;DR
This paper introduces a lightweight, real-time eye segmentation model built on depthwise convolutions, reaching 94.85% mIoU on the OpenEDS dataset with a 0.4 MB model.
Contribution
A novel multi-class eye segmentation method employing an encoder-decoder network with depthwise convolutions for efficient, real-time performance on limited hardware.
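To illustrate why depthwise convolutions cut the cost, the sketch below compares weight counts for a standard convolution and a depthwise convolution followed by a 1x1 pointwise convolution. The pointwise step, the function names, and the channel sizes are assumptions for illustration; the summary above only states that depthwise convolutions are used, and biases are ignored.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a k x k depthwise conv plus a 1x1 pointwise conv (bias ignored).

    Assumption: the depthwise step is paired with a pointwise step to mix
    channels; the paper's exact block layout is not given here.
    """
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1x1 convolution that mixes channels
    return depthwise + pointwise


if __name__ == "__main__":
    # Hypothetical layer sizes, chosen only for illustration.
    c_in, c_out, k = 32, 64, 3
    std = standard_conv_params(c_in, c_out, k)
    sep = depthwise_separable_params(c_in, c_out, k)
    print(f"standard: {std}, depthwise separable: {sep}, ratio: {std / sep:.2f}")
```

For these illustrative sizes the separable form needs roughly one eighth of the weights, which is the kind of saving that makes a sub-megabyte model plausible.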
Findings
Achieved 94.85% mIoU on the OpenEDS dataset.
Model size is only 0.4 MB, enabling deployment on hardware with limited resources.
Demonstrated real-time inference capability with high accuracy.
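For readers unfamiliar with the metric, here is a minimal sketch of how mean intersection over union (mIoU) can be computed from predicted and ground-truth label maps. It assumes integer class labels in flattened lists and skips classes absent from both maps; the exact averaging used in the paper's evaluation may differ, and the function name is hypothetical.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over classes for two equal-length flat label sequences.

    Assumption: classes that appear in neither sequence are skipped
    instead of being counted as IoU 0 or 1.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    # Toy 8-pixel example with 4 illustrative classes (labels 0 to 3).
    pred = [0, 0, 1, 1, 2, 2, 3, 3]
    target = [0, 0, 1, 2, 2, 2, 3, 3]
    print(mean_iou(pred, target, num_classes=4))
```

In the toy example, one pixel of class 2 is mislabeled as class 1, so the per-class IoUs are 1, 1/2, 2/3, and 1, and their mean is 19/24.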
Abstract
In this paper, we present a multi-class eye segmentation method that runs within the hardware limitations of real-time inference. Our approach has three major stages: obtain a grayscale image from the input, segment three distinct eye regions with a deep network, and remove incorrect areas with heuristic filters. Our model is based on an encoder-decoder structure whose key component is the depthwise convolution operation, which reduces the computation cost. We experiment on OpenEDS, a large-scale dataset of eye images captured by a head-mounted display with two synchronized eye-facing cameras. We achieve a mean intersection over union (mIoU) of 94.85% with a model of size 0.4 megabytes. The source code is available at https://github.com/th2l/Eye_VR_Segmentation
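The depthwise convolution operation named in the abstract can be sketched in plain Python as follows. This is an illustrative reference implementation and not the authors' code: it assumes stride 1, no padding, and list-of-lists tensors, and the function name is hypothetical. Each channel is filtered by its own kernel, with no mixing across channels.

```python
def depthwise_conv2d(image, kernels):
    """Apply one k x k kernel per channel (stride 1, no padding).

    image:   list of C channels, each a list of H rows of W numbers
    kernels: list of C kernels, each a list of k rows of k numbers
    Returns C output channels of size (H - k + 1) x (W - k + 1).
    """
    if len(image) != len(kernels):
        raise ValueError("need exactly one kernel per input channel")
    output = []
    for channel, kernel in zip(image, kernels):
        k = len(kernel)
        height, width = len(channel), len(channel[0])
        out_channel = []
        for i in range(height - k + 1):
            row = []
            for j in range(width - k + 1):
                acc = 0.0
                # Cross-correlation over the k x k window, as in deep learning frameworks.
                for di in range(k):
                    for dj in range(k):
                        acc += channel[i + di][j + dj] * kernel[di][dj]
                row.append(acc)
            out_channel.append(row)
        output.append(out_channel)
    return output


if __name__ == "__main__":
    image = [
        [[1, 2, 3], [4, 5, 6], [7, 8, 9]],  # channel 0
        [[1, 1, 1], [1, 1, 1], [1, 1, 1]],  # channel 1
    ]
    kernels = [
        [[1, 0], [0, 1]],  # applied to channel 0 only
        [[1, 1], [1, 1]],  # applied to channel 1 only
    ]
    print(depthwise_conv2d(image, kernels))
```

Because each output channel depends on a single input channel, the per-pixel cost scales with the number of channels rather than with the product of input and output channels.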
Taxonomy
Methods: Depthwise Convolution · Convolution
