Object Detection for Graphical User Interface: Old Fashioned or Deep Learning or a Combination?

Jieshan Chen; Mulong Xie; Zhenchang Xing; Chunyang Chen; Xiwei Xu; Liming Zhu; Guoqiang Li

arXiv:2008.05132·cs.CV·September 8, 2020

TL;DR

This paper conducts a large-scale empirical study comparing traditional image processing, deep learning, and hybrid methods for GUI element detection, leading to a new effective GUI-specific detection approach.

Contribution

It provides the first comprehensive evaluation of GUI element detection methods and introduces a novel hybrid approach that outperforms existing techniques.

Findings

01 Traditional CV methods lack GUI-specific awareness.

02 Deep learning models improve detection but have limitations.

03 The proposed hybrid method achieves state-of-the-art performance.

Abstract

Detecting Graphical User Interface (GUI) elements in GUI images is a domain-specific object detection task. It supports many software engineering tasks, such as GUI animation and testing, GUI search, and code generation. Existing studies of GUI element detection directly borrow mature methods from the computer vision (CV) domain, including old-fashioned ones that rely on traditional image processing features (e.g., Canny edges, contours), and deep learning models that learn to detect from large-scale GUI data. Unfortunately, these CV methods were not originally designed with awareness of the unique characteristics of GUIs and GUI elements, or of the high localization accuracy that the GUI element detection task requires. We conduct the first large-scale empirical study of seven representative GUI element detection methods on over 50k GUI images to understand the capabilities, limitations and…
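To make the "high localization accuracy" requirement concrete, here is a minimal sketch of intersection over union (IoU), a common way to score how closely a predicted bounding box matches a ground-truth box. The `Box` type, the `iou` function, and the example coordinates are all hypothetical and chosen for illustration; the abstract does not state which metric or threshold the study uses.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned bounding box in pixel coordinates (hypothetical type)."""
    left: float
    top: float
    right: float
    bottom: float


def iou(a: Box, b: Box) -> float:
    """Return the intersection-over-union of two boxes, in [0, 1]."""
    # Width and height of the overlapping region; zero if the boxes are disjoint.
    inter_w = max(0.0, min(a.right, b.right) - max(a.left, b.left))
    inter_h = max(0.0, min(a.bottom, b.bottom) - max(a.top, b.top))
    inter = inter_w * inter_h

    area_a = (a.right - a.left) * (a.bottom - a.top)
    area_b = (b.right - b.left) * (b.bottom - b.top)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Illustrative values only: a button's true box and a prediction that is
# a few pixels off on two sides.
truth = Box(100, 200, 300, 260)
predicted = Box(104, 202, 300, 260)
score = iou(truth, predicted)
```

In this example a prediction that misses by only 4 px and 2 px already drops the IoU below 0.95, which suggests why small GUI elements leave little room for localization error.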
