
103. Segmentation-Free Bangla/Korean Handwriting Recognition Using Neural Network Training

Steven Kim, Dr. Elisa H. Barney Smith, Nishatul Majid


Introduction

  • The purpose of this research is to recognize handwriting. We implemented the method for Bangla and Korean. This work has been submitted to the International Conference on Frontiers in Handwriting Recognition (ICFHR) 2020.
  • Image recognition is often used for transcription, since data stored as a digital file is much more reliable than data written on a physical object.
  • To achieve high accuracy and productivity, we used a Faster R-CNN detector built on the pre-trained VGG-16 network through transfer learning.
  • The Bangla network uses two separate networks, one to detect the characters and one to detect the diacritics.
  • The Korean network uses one network to detect the Hangul characters.
A flowchart explaining how a computer detects Bangla characters and their diacritics
Figure 1. Example of Bangla Network Detection
flowchart
Figure 2. Example of Korean Network Detection

Image Detection Tools

Faster R-CNN

  • Detects objects using bounding boxes of different sizes, as shown in Figures 1 and 2.
  • Efficiently detects objects that overlap one another, which happens often in many styles of handwriting.
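
As a small illustration of what "overlapping" means for bounding boxes, the sketch below computes intersection over union (IoU), a measure of box overlap commonly used in object detectors such as Faster R-CNN. The `Box` layout and the example coordinates are assumptions made for this sketch, not values from the poster.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned bounding box in pixel coordinates (hypothetical layout)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def area(box: Box) -> float:
    """Area of a box; degenerate (inverted) boxes count as empty."""
    return max(0.0, box.x_max - box.x_min) * max(0.0, box.y_max - box.y_min)


def iou(a: Box, b: Box) -> float:
    """Intersection over union: 0.0 for disjoint boxes, 1.0 for identical ones."""
    intersection = Box(max(a.x_min, b.x_min), max(a.y_min, b.y_min),
                       min(a.x_max, b.x_max), min(a.y_max, b.y_max))
    inter_area = area(intersection)
    union_area = area(a) + area(b) - inter_area
    return inter_area / union_area if union_area > 0 else 0.0


# Made-up example: a tall character box and a small diacritic box that
# partly overlaps its upper-right corner.
character = Box(10, 10, 50, 90)
diacritic = Box(40, 0, 60, 20)
print(round(iou(character, diacritic), 3))  # 0.029
```

A low IoU like this one indicates two distinct objects that merely touch, which is the situation a detector has to handle when handwritten strokes run into each other.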

VGG-16

  • A pre-trained network that we converted into a character-detecting network.
  • Reusing it is more efficient than building and training a network from scratch.
  • Has 16 weight layers (13 convolutional and 3 fully connected) and was developed by the Visual Geometry Group (VGG) at the University of Oxford.
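
The "16" in VGG-16 can be checked by counting layers in the published configuration. The sketch below lists the standard VGG-16 layout (numbers are the output channels of 3x3 convolutions, "M" marks a max-pooling step) and counts the layers that carry weights. The variable and function names are made up for this sketch. In a typical Faster R-CNN setup the convolutional part is reused as the feature extractor, which is where transfer learning comes in; the exact layers reused in this project are not stated on the poster.

```python
# Standard VGG-16 layout: integers are convolution output channels,
# "M" is a max-pooling layer (which has no trainable weights).
VGG16_CONV_CONFIG = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
                     512, 512, 512, "M", 512, 512, 512, "M"]

# The three fully connected layers of the original ImageNet classifier.
VGG16_FC_CONFIG = [4096, 4096, 1000]


def count_weight_layers(conv_config, fc_config):
    """Return (convolutional layers, fully connected layers)."""
    conv_layers = sum(1 for item in conv_config if item != "M")
    return conv_layers, len(fc_config)


conv_layers, fc_layers = count_weight_layers(VGG16_CONV_CONFIG, VGG16_FC_CONFIG)
print(conv_layers, fc_layers, conv_layers + fc_layers)  # 13 3 16
```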

PE92 Korean Handwriting Database

Our team used the PE92 database, which contains various compositions of handwritten Korean characters. It was collected by POSTECH and funded by ETRI in 1992.
The dataset contains 2350 classes with about 100 samples per class.

  • Korean handwriting recognition is known to be a difficult problem because of the variety of compositions and patterns.
  • Previously, In-Jung Kim and Xiaohui Xie applied a deep convolutional neural network and achieved 92.92% accuracy on PE92 [3].
Screenshot of Korean character datapack contents
Figure 3. Example classes and sample

Manual/Automated Tagging

To teach a network to recognize different objects, the training images must be tagged: a bounding box is drawn around each character and labeled with its class. Covering every Korean composition is crucial, since we want the machine to recognize every possible case. Overall, we manually tagged:

  • 133 classes of different compositions
  • 1468 samples

In the meantime, Dr. Barney Smith created an automated tagging framework. This framework uses existing tagged ground truth to produce more ground truth, which allowed us to obtain 133 classes with about 100 samples each.
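
To make the idea of a tag concrete, here is a minimal sketch of how one tagged image could be represented and how the number of tagged boxes per class could be counted. The `Tag` and `TaggedImage` structures, the file names, the coordinates, and the use of plain jamo as class labels are all assumptions for illustration; the project's actual ground-truth format is not described here.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Tag:
    """One tagged character: a class label plus its bounding box in pixels."""
    label: str
    x: int
    y: int
    width: int
    height: int


@dataclass
class TaggedImage:
    """A handwriting image together with every tag drawn on it."""
    image_path: str
    tags: list = field(default_factory=list)


def tags_per_class(images):
    """Count how many tagged boxes exist for each class label."""
    counts = Counter()
    for image in images:
        counts.update(tag.label for tag in image.tags)
    return counts


# Made-up example data; file names, labels, and coordinates are illustrative only.
images = [
    TaggedImage("sample_001.png", [Tag("ㅎ", 4, 2, 30, 18),
                                   Tag("ㅏ", 36, 2, 14, 40),
                                   Tag("ㄴ", 6, 44, 40, 16)]),
    TaggedImage("sample_002.png", [Tag("ㄱ", 5, 3, 34, 16),
                                   Tag("ㅏ", 40, 3, 12, 38)]),
]
counts = tags_per_class(images)
print(counts["ㅏ"], len(counts))  # 2 4
```

A per-class count like this is one way to check whether every class has enough samples before training.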

Computer detection of Korean characters
Figure 4. Example of manual tagging and automated tagging process

K-Net

Using the manually and automatically tagged data, we created a network called “K-Net”. K-Net detects each individual character and combines the detections into a compound letter. In this research, we measured both the individual and the compound detection accuracy to make sure that the network assembles the combination in the correct order.
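
How K-Net combines its detections internally is not described here. As an illustration of why the order of the detected letters matters, the sketch below uses the standard Unicode arithmetic for composing a Hangul syllable from an initial consonant, a vowel, and an optional final consonant. The function name `compose_syllable` is made up for this sketch.

```python
# Jamo in the order Unicode uses to index precomposed Hangul syllables.
INITIALS = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"        # 19 initial consonants
VOWELS = "ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ"      # 21 vowels
FINALS = ["", "ㄱ", "ㄲ", "ㄳ", "ㄴ", "ㄵ", "ㄶ", "ㄷ", "ㄹ", "ㄺ",
          "ㄻ", "ㄼ", "ㄽ", "ㄾ", "ㄿ", "ㅀ", "ㅁ", "ㅂ", "ㅄ", "ㅅ",
          "ㅆ", "ㅇ", "ㅈ", "ㅊ", "ㅋ", "ㅌ", "ㅍ", "ㅎ"]  # 27 finals plus "none"

HANGUL_BASE = 0xAC00  # code point of the first precomposed syllable


def compose_syllable(initial: str, vowel: str, final: str = "") -> str:
    """Combine jamo into one precomposed Hangul syllable."""
    index = (INITIALS.index(initial) * len(VOWELS)
             + VOWELS.index(vowel)) * len(FINALS) + FINALS.index(final)
    return chr(HANGUL_BASE + index)


print(compose_syllable("ㅎ", "ㅏ", "ㄴ"))  # 한
print(compose_syllable("ㄱ", "ㅡ", "ㄹ"))  # 글
```

The same consonant can appear as an initial or as a final, so assigning a detected letter to the wrong position produces a different syllable even when every individual letter was recognized correctly.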

Computer detecting Korean characters with the K-Net detection system
Figure 5. Example of K-Net detection

Training/Testing Environment (R2 Cluster)

Training a VGG-16-based network is computationally demanding, so a laptop without a dedicated graphics card may take a long time to finish the job. Thanks to the R2 compute cluster provided by the Research Computing Department, we were able to offload the computation by uploading the network and running it on the server.

Since we used a subset of the whole database, we were able to finish this process quickly, but with some loss of accuracy. With automated tagging, we obtained a Jamo Recognition Accuracy (JRA) of 91.22% and a Syllable Recognition Accuracy (SRA) of 84.66%.
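
The sketch below shows one plausible way to compute the two metrics, under the assumption that JRA is the fraction of individual jamo recognized correctly and SRA is the fraction of syllables in which every jamo is correct. The exact definitions used in the paper may differ, and the data here is made up.

```python
def recognition_accuracies(truth, predicted):
    """Return (JRA, SRA) for parallel lists of syllables.

    Each syllable is a tuple of jamo. Assumed definitions:
    JRA = correctly recognized jamo / total jamo in the ground truth,
    SRA = syllables with every jamo correct / total syllables.
    """
    total_jamo = 0
    correct_jamo = 0
    correct_syllables = 0
    for true_syllable, predicted_syllable in zip(truth, predicted):
        total_jamo += len(true_syllable)
        correct_jamo += sum(t == p for t, p in zip(true_syllable, predicted_syllable))
        if true_syllable == predicted_syllable:
            correct_syllables += 1
    return correct_jamo / total_jamo, correct_syllables / len(truth)


# Made-up example: the second syllable has one wrong vowel.
truth = [("ㅎ", "ㅏ", "ㄴ"), ("ㄱ", "ㅡ", "ㄹ")]
predicted = [("ㅎ", "ㅏ", "ㄴ"), ("ㄱ", "ㅜ", "ㄹ")]
jra, sra = recognition_accuracies(truth, predicted)
print(f"JRA = {jra:.2%}, SRA = {sra:.2%}")  # JRA = 83.33%, SRA = 50.00%
```

Under these definitions a single wrong jamo spoils the whole syllable, which is consistent with the SRA being lower than the JRA.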

Table III: Korean Recognition Results on PE92 Dataset

Researchers | Method | JRA | SRA
Park et al. [2] | MQDF | N.A. | 85.99%
Kim et al. [3] | DCNN | N.A. | 92.92%
Presented Approach (Using a Subset of PE92) | Character Spotting (Autonomous Tagging) | 91.22% | 84.66%
Presented Approach (Using a Subset of PE92) | Character Spotting (Manual Tagging) | 86.22% | 79.23%
Figure 6. Result comparison from the article “Autonomous Data Tagging for Offline Handwriting Recognition: Tested with Bangla and Korean Scripts” by Nishatul Majid and Elisa H. Barney Smith

Conclusion

This project is derived from Nishatul Majid’s framework for Bangla offline handwriting recognition. Applying it to Korean is a new direction for that approach.

Most scripts are written linearly from side to side, whereas Korean first combines individual letters into different compositions and then writes those compositions from left to right.

Our maximum syllable recognition accuracy was 84.66%. We have not yet applied any post-processing or used the whole dataset, so this framework may be able to reach a higher accuracy.

References

  1. N. Majid and E. H. Barney Smith, “Segmentation-Free Bangla Offline Handwriting Recognition using Sequential Detection of Characters and Diacritics with a Faster R-CNN,” in International Conference on Document Analysis and Recognition (ICDAR), September 2019.
  2. G.-R. Park, I.-J. Kim, and C.-L. Liu, “An evaluation of statistical methods in handwritten Hangul recognition,” International Journal on Document Analysis and Recognition (IJDAR), vol. 16, no. 3, pp. 273-283, 2013.
  3. I.-J. Kim and X. Xie, “Handwritten Hangul recognition using deep convolutional neural networks,” International Journal on Document Analysis and Recognition (IJDAR), vol. 18, no. 1, pp. 1-13, 2015.

Additional Information

For questions or comments about this research, contact Steven Kim at stevenkim@u.boisestate.edu.