ASE2022
Accelerating OCR-Based Widget Localization for Test Automation of GUI Applications
Ju Qian, Yingwei Ma, Chenghao Lin, Lin Chen
6 citations
Abstract
Optical character recognition (OCR) algorithms often run slow. They may take several seconds to recognize the texts on a GUI screen, which makes OCR-based widget localization in test automation unfriendly for use, especially on GPU-free computers. This paper first concludes a common type of widget text to be located in GUI testing: label text, which are short texts in widgets like buttons, menu items, and window titles. We then investigate the characteristics of texts on a GUI screen and introduce a fast GPU-independent Label Text Screening (LTS) technique to accelerate the OCR process for label text localization. The technique opens the black box of OCR engines and uses a combination of simple methods to avoid excessive text analysis on a screen as much as possible. Experiments show that, on the subject datasets, LTS reduces the average OCR-based label text localization time to a large extent. On 4k resolution GUI screens, it keeps the localization time below 0.5 seconds for over about 60% of cases without GPU support on a normal laptop computer. In contrast, the existing CPU-based approaches built on popular OCR engines Tesseract, PaddleOCR, and EasyOCR usually need over 2 seconds to achieve the same goal on the same platform. Even with GPU acceleration, they can hardly keep the analysis time in 1 second. We believe the proposed approach would be helpful for implementing OCR-based test automation tools.