Resource Aware Scene Text Recognition Using Learned Features, Quantization, and Contour Based Charac

Abstract:

Scene text serves as a valuable source of information for humans and autonomous systems to make informed decisions. Processing scene text poses significant difficulties for computer systems, primarily because of variations in image characteristics. These variations make it challenging for computer systems to accurately detect and interpret scene text, even though it is easily understood by humans. To address this problem, scene text detection and recognition methods leverage computer vision and/or deep learning techniques. Deep learning methods, however, require substantial computing power, memory, and energy, which makes them difficult to deploy in real-time embedded applications, particularly those that run on integer-only hardware. In this paper, we develop an approach to address this challenge and, to demonstrate its effectiveness, train end-to-end models for shipping container number detection and recognition, showing that our proposed method processes scene text accurately and reliably on integer-only hardware. Our optimization efforts yielded strong results: we reduced the model size by a factor of 3.8x without significantly affecting model performance, and the optimized models ran 1.6x faster while using 6.6x less peak RAM than the base models. These results demonstrate the efficiency and practicality of our approach for scene text processing on integer-only embedded hardware.
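
To illustrate the kind of quantization step the abstract refers to, the following is a minimal sketch of full-integer post-training quantization using TensorFlow Lite. The abstract does not name the framework, model architecture, or input shape, so all of those are assumptions here; the placeholder network and random calibration data stand in for the actual detection and recognition models and their training images.

    import numpy as np
    import tensorflow as tf

    # Placeholder recognition model; the actual architectures are described in the paper body.
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(32, 128, 1)),
        tf.keras.layers.Conv2D(16, 3, activation="relu"),
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(36, activation="softmax"),  # e.g. A-Z + 0-9 character classes
    ])

    def representative_dataset():
        # Real training images should be used here to calibrate activation ranges;
        # random data is only a stand-in for this sketch.
        for _ in range(100):
            yield [np.random.rand(1, 32, 128, 1).astype(np.float32)]

    converter = tf.lite.TFLiteConverter.from_keras_model(model)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    converter.representative_dataset = representative_dataset
    # Restrict conversion to int8 kernels so the model can run on integer-only hardware.
    converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
    converter.inference_input_type = tf.int8
    converter.inference_output_type = tf.int8

    tflite_model = converter.convert()
    with open("recognizer_int8.tflite", "wb") as f:
        f.write(tflite_model)

Converting weights and activations to 8-bit integers in this way is what typically produces the size, latency, and RAM reductions of the magnitude reported above, since both storage and arithmetic move from 32-bit floating point to 8-bit integer operations.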