Convolutional neural networks (CNNs) are at the forefront of accurate character recognition. This paper explores CNNs at their maximum capacity by training them on large datasets. We show near-perfect performance using a dataset of about 820,000 real samples of isolated handwritten digits, much larger than the conventional MNIST database. In addition, we report near-perfect performance on the recognition of machine-printed digits and multi-font born-digital digits. To progress toward a universal OCR, we also propose methods of combining the datasets into one classifier. This paper reveals the effects of combining the datasets prior to training and the effects of transfer learning during training. The proposed methods also achieve almost perfect accuracy, suggesting that the network can generalize across all forms of text.
|Title of host publication|Proceedings - 2016 15th International Conference on Frontiers in Handwriting Recognition, ICFHR 2016|
|Publisher|Institute of Electrical and Electronics Engineers Inc.|
|Published|Jul 2 2016|
|Conference|15th International Conference on Frontiers in Handwriting Recognition, ICFHR 2016 - Shenzhen, China|
|Duration|Oct 23 2016 → Oct 26 2016|
|Publication series|Proceedings of International Conference on Frontiers in Handwriting Recognition, ICFHR|
All Science Journal Classification (ASJC) codes
- Computer Science Applications
- Computer Vision and Pattern Recognition