A novel ensemble deep network framework for scene text recognition

Sunil Kumar Dasari, Shilpa Mehta, Diana Steffi


Scene text recognition (STR) is one of the most important and difficult challenges in image-based sequence recognition, and in recent years it has been considered a sequence-to-sequence problem. Attention-based techniques have greater potential for context-semantic modelling, but they tend to overfit when training data are inadequate. A novel framework, the ensemble deep network (EDN), is proposed; it comprises a customized convolutional neural network (CNN) and a deep autoencoder. The customized CNN introduces an optimal spatial transformation module that optimizes irregular text input so that it can be read at the same size. Further, the deep autoencoder is introduced with an effective attention mechanism that utilizes the inherent features. In simulations, the proposed ensemble deep network-proposed system (EDN-PS) approach outperforms existing state-of-the-art techniques on both regular and irregular scene text, generating better results on the IIIT5K, ICDAR-13, ICDAR-15, and CUTE datasets.


Autoencoder; Customized CNN; EDN-proposed system; Ensemble deep network; Scene text recognition



DOI: http://doi.org/10.11591/ijres.v13.i2.pp403-413



This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

International Journal of Reconfigurable and Embedded Systems (IJRES)
p-ISSN 2089-4864, e-ISSN 2722-2608
This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).
