ScreenSeg: On-Device Screenshot Layout Analysis
Manoj Goyal, Rachit S Munjal, Sukumar Moharana, Deepak Garg, Debi Prasanna Mohanty, Siva Prasad Thota

TL;DR
ScreenSeg introduces an on-device, hierarchical layout analysis method for screenshots that accurately segments entities like text, images, and icons, enabling smart editing and various content-related applications on resource-limited mobile devices.
Contribution
It presents a novel end-to-end on-device layout analysis approach with a new weighted NMS technique, optimized for complex screenshots and resource-constrained devices.
Findings
Achieves 0.95 average precision on 1080p screenshots
Operates with approximately 200ms latency on Samsung Galaxy S10
Supports a wide variety of semantically complex screenshots
Abstract
We propose a novel end-to-end solution that performs a Hierarchical Layout Analysis of screenshots and document images on resource-constrained devices like mobile phones. Our approach segments entities like Grid, Image, Text and Icon blocks occurring in a screenshot. We provide an option for smart editing by auto-highlighting these entities for saving or sharing. Further, this multi-level layout analysis of screenshots has many use cases, including content extraction, keyword-based image search, style transfer, etc. We have addressed the limitations of known baseline approaches, supported a wide variety of semantically complex screenshots, and developed an approach which is highly optimized for on-device deployment. In addition, we present a novel weighted NMS technique for filtering object proposals. We achieve an average precision of about 0.95 with a latency of around 200ms on Samsung…
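The abstract mentions a weighted NMS technique for filtering object proposals but does not describe it here. As a rough illustration of the general idea only, the sketch below shows one common generic variant: overlapping proposals are merged into a single box whose coordinates are a confidence-weighted average, instead of keeping only the top-scoring box. This is not the authors' method. The function names `iou` and `weighted_nms`, the IoU threshold of 0.5, and the choice to keep the highest score in each cluster are all assumptions made for this example.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), hypothetical layout


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def weighted_nms(
    boxes: List[Box], scores: List[float], iou_threshold: float = 0.5
) -> List[Tuple[Box, float]]:
    """Generic weighted NMS sketch (an assumption, not the paper's variant).

    Proposals are visited in descending score order. Each unused proposal
    collects the remaining proposals that overlap it by at least
    `iou_threshold`, and the cluster is replaced by one box whose coordinates
    are the score-weighted mean of its members.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    used = [False] * len(boxes)
    merged: List[Tuple[Box, float]] = []
    for i in order:
        if used[i]:
            continue
        # The seed box always belongs to its own cluster.
        cluster = [i] + [
            j for j in order
            if j != i and not used[j] and iou(boxes[i], boxes[j]) >= iou_threshold
        ]
        for j in cluster:
            used[j] = True
        total = sum(scores[j] for j in cluster)
        if total > 0:
            box = tuple(
                sum(scores[j] * boxes[j][k] for j in cluster) / total
                for k in range(4)
            )
        else:
            box = boxes[i]  # fall back to the seed box if all scores are zero
        # Keep the seed's (highest) score for the merged box.
        merged.append((box, scores[i]))
    return merged


if __name__ == "__main__":
    proposals = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
    confidences = [0.9, 0.6, 0.8]
    for box, score in weighted_nms(proposals, confidences):
        print(box, score)
```

Compared with standard NMS, which discards every overlapping proposal except the best one, this style of merging lets lower-scoring proposals still influence the final box position, which can help when block boundaries in a screenshot are ambiguous.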
