Large pattern recognition system using multi neural networks
By Vietdungiitb, 31 May 2012
Download drawing samples - 52 KB
Download handwriting_recognition_using_multi_neural_networks.flv - 8.9 MB
Download source - 1.3 MB
Download demo - 146.6 KB
Download lower_case_letter_v2.zip - 5.6 MB
Download digit_v2.zip - 3.3 MB
Download capital_letter_v2.zip - 5.6 MB
Introduction

Nowadays, artificial neural networks are widely applied in many fields. However, creating an efficient network for a
large classifier such as a handwriting recognition system is still a big challenge. In my previous article, “Library for
online handwriting recognition system using UNIPEN database”, I presented a library for handwriting recognition
systems that makes it simple to create and modify a neural network. The demo program showed good recognition
results on the digit set (97%) and the alphabet sets (93%). In this article I continue by presenting a solution for large
pattern classification in general and handwriting recognition in particular.

The recognition rate increases significantly when an additional spell checker module is used
Neural network for a recognition system
In the traditional model of pattern recognition, a hand-designed feature extractor gathers relevant information from
the input and eliminates irrelevant variability. A trainable classifier (normally a standard, fully connected multi-layer
neural network) then categorizes the resulting feature vectors into classes. However, this approach has some problems
that can affect the recognition results. The convolutional neural network (CNN) addresses these shortcomings of the
traditional model to achieve the best performance on pattern recognition tasks.
A CNN is a special form of multi-layer neural network. Like other networks, CNNs are trained with back-propagation
algorithms. The difference lies in their architecture. A convolutional network combines three architectural ideas to
ensure some degree of shift, scale, and distortion invariance: local receptive fields, shared weights (or weight
replication), and spatial or temporal sub-sampling. CNNs are designed specifically to recognize patterns directly from
digital images with a minimum of pre-processing. The architecture of the CNN is described comprehensively in the
articles of Dr. Yann LeCun and Dr. Patrice Simard (see my previous articles).

Figure 1: The Architecture of LeNET 5
Figure 2: An input image followed by a feature map performing a 5 × 5 convolution and a 2 × 2 sub-sampling map
The recognition rates of the above networks are very high for small pattern collections such as digits, capital letters
or lower case letters. However, when we want to create a larger neural network that can recognize a bigger collection,
for example digits and English letters together (62 characters), problems begin to appear. Finding an optimized and
large enough network becomes more difficult, and training the network on a large set of input patterns takes much
longer. The network converges more slowly and, in particular, the accuracy rate decreases significantly because there
are more badly written characters, many similar and confusable characters, and so on. Furthermore, even if we could
create a network good enough to recognize English characters accurately, it certainly could not properly recognize a
character outside its output set (a Russian or Chinese character, for example), because it has no capacity for
expansion. Therefore, creating a single network for a very large pattern classifier is very difficult and may be
impossible.
The proposed solution to the above problems is to replace the single big network with multiple smaller networks, each
of which has a very high recognition rate on its own output set. Besides the official output set (digits, letters…),
each network has an additional unknown output (unknown character). If the input pattern is not recognized as a
character of the official outputs, it is treated as an unknown character. The input pattern is then passed to the next
network, and so on, until the system can recognize it correctly.
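
The hand-off between networks described above can be sketched as follows. This is only a minimal illustration, not the library's code: the `Cascade` class is hypothetical, each component network is modelled as a plain `Func<double[], char>` delegate, and `'?'` is assumed to be the unknown marker.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch of the cascade; none of these names come from the library.
public static class Cascade
{
    // Assumed marker returned by a network when the pattern is outside its output set.
    public const char Unknown = '?';

    // Passes the pattern to each network in turn until one returns a known character.
    public static char Recognize(IEnumerable<Func<double[], char>> networks, double[] pattern)
    {
        foreach (Func<double[], char> network in networks)
        {
            char result = network(pattern);
            if (result != Unknown)
            {
                return result; // this network claimed the pattern
            }
        }
        return Unknown; // no loaded network recognized the pattern
    }
}
```

With this shape, adding support for a new character set only means appending another delegate to the list; the networks already in the list are not touched.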

Figure 3: Convolution neural network with unknown output

Figure 4: Recognition System using multi neural networks
This solution overcomes most of the limitations of the traditional model. The new system consists of several small
networks that are simple to optimize for the best recognition results. Training these small networks takes less time
than training one huge network. Above all, the new model is flexible and expandable. Depending on the requirements we
can load one or more networks; we can also add new networks to the system to recognize new patterns without changing
or rebuilding the model. All of these small networks can be reused in another multi neural network system.
Experiment
The demo program is built to show all stages of a recognition system: creating a component network, training a
network, testing networks on the UNIPEN dataset, and testing networks on a mouse drawing control. It serves as a
tutorial that can help anyone understand a recognition system. All functions are available from the program GUI, so
you can create, train, and test your network at runtime without changing any code or restarting the program.

Figure 5: Handwriting recognition system interface
Creating new neural network
Figure 6: Creating new neural network Interface
Creating a new neural network is done entirely through the GUI. The network depends on the input pattern size, the
number of layers, the data set, and so on. On the output layer we can tick the unknown output checkbox to add an
additional unknown output to the network, or leave it unticked to create a normal network.
Of course, we can still create a network in code:
void CreateNetwork()
{
    network = new ConvolutionNetwork();
    network.Layers = new Layer[6];
    network.LayerCount = 6;

    // Layer 0: input layer for 29 x 29 patterns
    InputLayer inputlayer = new InputLayer("00-Layer Input", new Size(29, 29));
    network.InputDesignedPatternSize = new Size(29, 29);
    inputlayer.Initialize();
    network.Layers[0] = inputlayer;

    // Layer 1: convolution and sub-sampling with 13 x 13 output
    // (the arguments 10 and 5 are presumably the feature map count and the kernel size)
    ConvolutionLayer convlayer = new ConvolutionLayer("01-Layer ConvolutionalSubsampling",
        inputlayer, new Size(13, 13), 10, 5);
    convlayer.Initialize();
    network.Layers[1] = convlayer;

    // Layer 2: convolution and sub-sampling with 5 x 5 output
    convlayer = new ConvolutionLayer("02-Layer ConvolutionalSubsampling",
        convlayer, new Size(5, 5), 60, 5);
    convlayer.Initialize();
    network.Layers[2] = convlayer;

    // Layers 3 and 4: fully connected layers with 200 and 100 neurons
    FullConnectedLayer fulllayer = new FullConnectedLayer("03-Layer FullConnected", convlayer, 200);
    fulllayer.Initialize();
    network.Layers[3] = fulllayer;
    fulllayer = new FullConnectedLayer("04-Layer FullConnected", fulllayer, 100);
    fulllayer.Initialize();
    network.Layers[4] = fulllayer;

    // Layer 5: one output per target character
    // (the final 'true' presumably enables the additional unknown output)
    OutputLayer outputlayer = new OutputLayer("05-Layer Output", fulllayer, Letters3.Count, true);
    outputlayer.Initialize();
    network.Layers[5] = outputlayer;

    network.TagetOutputs = Letters3;
    network.UnknownOuput = '?';
}
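
The layer sizes in the listing above (29, then 13, then 5) are consistent with a 5 × 5 kernel that moves two pixels at a time, which is how convolution and sub-sampling are merged in Simard-style networks. The small check below is a sketch under that assumption; `FeatureMapMath` and the step of 2 are illustrative and are not taken from the library.

```csharp
using System;

// Hypothetical helper, not part of the library.
public static class FeatureMapMath
{
    // Side length of the map produced by sliding a square kernel over a square input
    // with a fixed step: (input - kernel) / step + 1, using integer division.
    public static int OutputSize(int inputSize, int kernelSize, int step)
    {
        if (step <= 0) throw new ArgumentException("step must be positive");
        if (kernelSize > inputSize) throw new ArgumentException("kernel larger than input");
        return (inputSize - kernelSize) / step + 1;
    }
}
```

Under these assumptions `FeatureMapMath.OutputSize(29, 5, 2)` gives 13 and `FeatureMapMath.OutputSize(13, 5, 2)` gives 5, matching the `Size(13, 13)` and `Size(5, 5)` arguments passed to the two convolution layers.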
Training a network
After a neural network has been created with the "Create network" function, it is trained on the UNIPEN database.

Figure 7: Training network interface
Depending on the network size, we can choose training set 1a, 1b or 1c in the UNIPENdata folder. The statistics of the
training process show useful information such as the epoch number, MSE, training time per epoch, and success rate.
UNIPEN data browser and recognition testing
The UNIPEN data browser control in the demo program can show all the UNIPEN data files. We can also test a trained
neural network on these files by loading the trained network parameter files.

Figure 8: UNIPEN data browser and recognition interface
Mouse Drawing test


Figure 9: Mouse drawing recognition interface
The mouse drawing control is based on the excellent article “DrawTools” by Alex Fr. I only changed some code to fit my
requirements. The cursive text in the image is divided into lines, words and isolated characters by the same algorithm,
as follows:
private void btRecognition_Click(object sender, EventArgs e)
{
    // Recognize all characters in the drawArea.
    if (bitmap != null)
    {
        bitmap.Dispose();
        bitmap = null;
    }
    // Render the drawing surface into a bitmap; drawBitmap is the copy shown in the preview.
    bitmap = new Bitmap(drawArea.Width, drawArea.Height);
    drawArea.DrawToBitmap(bitmap, new Rectangle(0, 0, bitmap.Width, bitmap.Height));
    drawBitmap = (Bitmap)bitmap.Clone();
    lbRecognizedText.Items.Clear();

    // Step 1: split the whole image into text lines.
    InputPattern parentPt = new InputPattern(bitmap, 255,
        new Rectangle(0, 0, bitmap.Width, bitmap.Height));
    List<InputPattern> lineList = GetPatternsFromBitmap(parentPt, 500, 1, true, 10, 10);
    if (lineList != null && lineList.Count > 0)
    {
        characterList = new List<InputPattern>();
        foreach (var line in lineList)
        {
            String text = "";
            // Step 2: split each line into words.
            List<InputPattern> wordList = GetPatternsFromBitmap(line, 50, 10, false, 10, 10);
            if (wordList != null && wordList.Count > 0)
            {
                foreach (var word in wordList)
                {
                    // Step 3: split each word into isolated characters.
                    List<InputPattern> charList = GetPatternsFromBitmap(word, 5, 5, false, 10, 10);
                    if (charList != null && charList.Count > 0)
                    {
                        panelNavigation.Visible = true;
                        foreach (var c in charList)
                        {
                            characterList.Add(c);
                            c.GetPatternBoundaries(5, 5, false, 10, 10);
                            Char accChar = new Char();
                            PatternRecognition(c.OriginalBmp, out accChar);
                            // '\0' means no character was recognized for this pattern.
                            if (accChar != '\0')
                            {
                                text = String.Format("{0}{1}", text, accChar.ToString());
                                drawBitmap = c.DrawChildPatternBoundaries(drawBitmap);
                            }
                        }
                    }
                    // Separate words with a space.
                    text = String.Format("{0} ", text);
                }
            }
            lbRecognizedText.Items.Add(text);
        }
    }
    pbPreview.Image = drawBitmap;
    // Guard against a null list when no line has ever been found.
    lblNavigation.Text = (characterList != null) ? characterList.Count.ToString() : "0";
    index = 0;
}
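
The body of `GetPatternsFromBitmap` is not shown in this article, so the following is only a guess at the general idea and not the author's implementation. One common way to cut an image into lines, words and characters with a single routine is to split a projection profile (the number of ink pixels per row or per column) wherever the run of empty positions is wide enough. `ProjectionSplit` and its parameters are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch; not the library's GetPatternsFromBitmap.
public static class ProjectionSplit
{
    // Splits a projection profile into [start, end) runs of ink that are separated
    // by at least minGap consecutive empty positions.
    public static List<Tuple<int, int>> Split(int[] profile, int minGap)
    {
        var segments = new List<Tuple<int, int>>();
        int start = -1;   // start of the current segment, or -1 when outside a segment
        int lastInk = -1; // last position that contained ink
        for (int i = 0; i < profile.Length; i++)
        {
            if (profile[i] > 0)
            {
                if (start < 0) start = i;
                lastInk = i;
            }
            else if (start >= 0 && i - lastInk >= minGap)
            {
                // The gap is wide enough: close the current segment.
                segments.Add(Tuple.Create(start, lastInk + 1));
                start = -1;
            }
        }
        if (start >= 0) segments.Add(Tuple.Create(start, lastInk + 1));
        return segments;
    }
}
```

Applying the same routine with a large gap on row sums, then with smaller gaps on column sums, would produce lines, words and characters in turn. Cursive characters that touch each other cannot be separated this way, so a real segmenter needs more than this.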

Figure 10: Loading trained network parameters files
To activate the recognition function, I simply load the trained network parameter files. Depending on my recognition
requirements I can load one, two or all of the files. The recognition results are very good (higher than 90%) if I load
only one network to recognize its own output characters. However, when I load multiple networks the system’s accuracy
rate becomes lower. The main reasons are the many confusable characters in cursive text, training sets that are not
large enough, and so on.
For a large pattern collection like handwritten characters, there are many similar characters that can confuse not
only a machine but also a human in some cases, such as O, 0 and o, or 9, 4, g and q. These characters can make the
networks misrecognize. Hence the solution has been upgraded with an additional spell checker/voting module at the
output of the system, which significantly increases the recognition rate. The input pattern is recognized by all
component networks. Their outputs (except unknown outputs) are then used as the inputs of the spell checker/voting
module. The module relies on the previously recognized characters, an internal dictionary and other factors to decide
which candidate is the most accurate recognized character.
Figure 11: The new recognition system using the spell checker/voting module (internal dictionary)
The spell checker module makes the system recognize much better
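
The spell checker/voting module has not been published yet, so the sketch below is not its implementation. It only illustrates the idea stated above, deciding between candidates from the previously recognized characters and an internal dictionary. `DictionaryVote`, its scoring rule and its fallback to the first candidate are all assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch; the real module may use other factors as well.
public static class DictionaryVote
{
    // Picks the candidate that, appended to the characters recognized so far, is a prefix
    // of the most dictionary words. Falls back to the first candidate when none matches.
    public static char Choose(string prefix, IList<char> candidates, IEnumerable<string> dictionary)
    {
        if (candidates == null || candidates.Count == 0)
        {
            throw new ArgumentException("no candidates");
        }
        List<string> words = dictionary.ToList();
        char best = candidates[0];
        int bestCount = 0;
        foreach (char candidate in candidates)
        {
            string attempt = prefix + candidate;
            int count = words.Count(w => w.StartsWith(attempt, StringComparison.Ordinal));
            if (count > bestCount)
            {
                best = candidate;
                bestCount = count;
            }
        }
        return best;
    }
}
```

For example, if "g" has already been recognized and the component networks propose '0', 'O' and 'o', a dictionary containing "good" and "gold" makes 'o' the winner. A digit network alone could not make that decision.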
Conclusion
The proposed recognition model solves most of the problems of a large recognition system: the capacity to recognize a
large pattern collection, flexible design and deployment, and expandable and reusable capacity. Increasing the accuracy
rate of the system also becomes easier, by increasing the recognition rate of the component networks, by using the
spell checker/voting module, and so on. The demo program also proves the capability of the library, which could be
used in many other applications such as prediction and face recognition.
Future work and upgrade
Some features will be updated in the library:
- Convolution and sampling layers of the LeNET model.
- Spell checker/voting module.
- Character segmentation.
At the moment, the project takes too much of my free time. It may slow down or stop temporarily until I can rearrange
everything and/or find a good new sponsorship. However, the votes and comments on the article will decide whether the
project continues. I would really appreciate comments and suggestions on the article, especially on the model, the
spell checker module and the character segmentation algorithm.
History
version 1.0: initial code
version 1.1: the spell checker/voting module has been added to the system, which significantly increases the
recognition rate. I was really surprised and happy. I will publish it when I complete the code rearrangement.
Last Updated 1 Jun 2012
Article Copyright 2012 by Vietdungiitb
Everything else Copyright © CodeProject, 1999-2012

License
This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)
About the Author
Vietdungiitb
Vietnam Maritime University
Vietnam