Abstract:
Automatic text localization and recognition are emerging trends in today's education industry. However, datasets of English text-lines handwritten by school students are scarce, and existing work tends to focus on improving text detectors rather than on shortening detector training time. This article introduces a data synthesis method for building a dataset that can be used to locate composition text-lines in scanned answer sheet images when few labeled data are available. The synthetic handwritten text-line dataset includes 5k composite images and more than 2.5k coordinate annotations. The method also allows a CTPN text-line detection network to be trained from scratch. In addition, Handwriting Recognition (HWR) is key to reviewing large batches of English compositions on answer sheets. However, handwriting features differ greatly from scene-text features, which poses a challenge for traditional recognizers. Hence, this article introduces MLC-CRNN, a refined handwritten text-line recognizer based on CRNN that improves recognition accuracy. The proposed method examines network depth and the MLC module separately and shows that both contribute to handwritten text-line recognition.