Publication
ICDAR 2019
Conference paper
Simultaneous optimisation of image quality improvement and text content extraction from scanned documents
Abstract
Convolutional neural networks have achieved breakthrough performance on single image super resolution (SISR) for natural images. These state-of-the-art (SOA) networks have been adapted to single text image super resolution and shown to boost optical character recognition (OCR) performance. However, these approaches train the SR network with variations of the standard mean squared error (MSE) loss, which improves text image quality but does not guarantee optimal OCR performance. In this paper, we propose to incorporate the OCR performance into the loss function during network training. This yields high-resolution text images whose OCR performance is comparable to that of the ground-truth high-resolution text images and surpasses that of the SOA baselines. We define novel, intuitive metrics to capture the improvement in OCR performance and provide extensive experiments that qualitatively and quantitatively assess our proposed approach against the SOA baselines on the standard UNLV dataset.
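The idea of adding an OCR term to an image-quality loss can be illustrated with a minimal sketch. The abstract does not state the exact form of the loss, so everything here is an assumption made for illustration: the weighted-sum form, the use of character error rate as the OCR term, and the names `combined_loss` and `ocr_weight` are hypothetical and are not taken from the paper. Edit distance is also not differentiable, so an actual training setup would need a differentiable surrogate for the OCR term.

```python
from typing import Sequence


def mse(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean squared error between two equally sized pixel sequences."""
    if len(pred) != len(target) or len(pred) == 0:
        raise ValueError("inputs must be non-empty and the same length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed one table row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def char_error_rate(ocr_text: str, truth: str) -> float:
    """Edit distance normalised by the ground-truth length."""
    if not truth:
        return 0.0 if not ocr_text else 1.0
    return edit_distance(ocr_text, truth) / len(truth)


def combined_loss(sr_pixels: Sequence[float],
                  hr_pixels: Sequence[float],
                  ocr_text: str,
                  truth: str,
                  ocr_weight: float = 0.5) -> float:
    """Hypothetical objective: image MSE plus a weighted OCR error term."""
    return mse(sr_pixels, hr_pixels) + ocr_weight * char_error_rate(ocr_text, truth)
```

In this sketch, `ocr_weight` trades pixel fidelity against recognition accuracy: at 0 the objective reduces to plain MSE, and larger values penalise super-resolved images that the OCR engine misreads even when they are close to the ground truth pixel-wise.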