
Suitability of ANN + preprocessing algorithm for OCR

by Brendon Costa <brendon@[EMAIL PROTECTED] > Jun 3, 2008 at 06:54 PM

Hi,

I am writing some software to capture text from a video stream of a PC
game. I am looking into ANNs for the recognition of individual
glyphs. I have been using a more naive bitmap matching approach up
until now, but it is not scaling well in terms of maintaining the
bitmap patterns for all variations of the font, especially with
anti-aliased text.

The text I am trying to recognize all seems to be made from a few
variations of one font:

* Bold or Normal
* Anti-Aliased or not
* So far I have seen 4 different sizes

There is one other complication that occurs too:

* Sometimes glyphs overlap by 1 pixel, e.g. the text string "eve" has
no discernible gaps between the 3 glyphs, and the two top points of the
'v' character on both the left and right sides are ALSO part of the
'e' characters.

There are a few advantages though. Mainly, unlike scanned
documents, the original images won't be "missing" any pixels, though
they may gain a few pixels, and I also know exactly at what pixel in
the y dimension each line starts.

I have no experience with ANNs (except what I have read on the
Internet in the last few days), so after reading a little on the web I
thought I would ask people with more experience in ANNs whether what
I intend is feasible.

I will describe how I intend to prepare the data for the ANN and what
I expect of its classification, and then ask my questions at the
end.

I was going to pre-process the data using the following method before
passing it to the ANN.

* Obtain a glyph by looking for horizontal 'gaps' (this may sometimes
return a sequence of multiple glyphs like: "eve" for example instead
of individual glyphs).

* Resize the glyph based on its height to force it to a height of 15
pixels (the same scaling ratio calculated for the height is applied to
the width as well), using linear interpolation. E.g. a glyph 50 pixels
wide and 30 pixels high would scale to 25x15 pixels.

* If the glyph is wider than the "max width of all glyphs" (this is a
pre-determined constant value) then truncate the glyph to a width
of max_width. For example, if the glyph contains all pixels for "eve"
and not just 'e', this might truncate it to all the pixels for the
glyph 'e' and half of the pixels for the glyph 'v', which I hope I can
train the ANN to ignore...

* If the glyph is narrower than max_width then pad it with blank
pixels on the right to make its width equal to max_width.
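A minimal sketch of the gap-finding step from the list above, under a
few assumptions that are not in the post: the line of text is held as a
list of pixel rows, a "gap" means a column in which every pixel equals
the background value, and the name `split_on_gaps` is hypothetical.
Overlapping glyphs such as "eve" would come back as one span, as
described.

```python
def split_on_gaps(image, background=0):
    """Return (start, end) column ranges of consecutive non-blank columns.

    `image` is a list of rows; each row is a list of pixel intensities.
    A column counts as a gap when every pixel in it equals `background`.
    `end` is exclusive, so a span covers columns start .. end - 1.
    """
    if not image:
        return []
    width = len(image[0])
    spans = []
    start = None
    for x in range(width):
        blank = all(row[x] == background for row in image)
        if not blank and start is None:
            start = x                  # a run of inked columns begins
        elif blank and start is not None:
            spans.append((start, x))   # the run ended just before column x
            start = None
    if start is not None:
        spans.append((start, width))   # run touches the right edge
    return spans
```

Each returned span can then be cropped out of the line and passed on
to the resizing step.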


At this point I would have a glyph of dimensions: max_width x 15
pixels. I consider this a "normalized" glyph that may contain extra
"junk" following the actual glyph character.
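The resize, truncate and pad steps could be sketched roughly as
follows. This is only an illustration: the function names, the list of
rows representation, the default max_width of 20 and the use of 0 as
the blank pixel value are all assumptions, and the interpolation is
plain bilinear sampling written out by hand.

```python
TARGET_HEIGHT = 15  # fixed height from the description above


def resize_bilinear(glyph, new_w, new_h):
    """Resize a list-of-rows greyscale image with bilinear interpolation."""
    old_h, old_w = len(glyph), len(glyph[0])
    out = []
    for y in range(new_h):
        # Map the centre of the output pixel back into source coordinates.
        sy = min(max((y + 0.5) * old_h / new_h - 0.5, 0.0), old_h - 1)
        y0 = int(sy)
        y1 = min(y0 + 1, old_h - 1)
        fy = sy - y0
        row = []
        for x in range(new_w):
            sx = min(max((x + 0.5) * old_w / new_w - 0.5, 0.0), old_w - 1)
            x0 = int(sx)
            x1 = min(x0 + 1, old_w - 1)
            fx = sx - x0
            top = glyph[y0][x0] * (1 - fx) + glyph[y0][x1] * fx
            bottom = glyph[y1][x0] * (1 - fx) + glyph[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out


def normalize_glyph(glyph, max_width=20, blank=0):
    """Scale to TARGET_HEIGHT, then truncate or right-pad to max_width."""
    old_h, old_w = len(glyph), len(glyph[0])
    scale = TARGET_HEIGHT / old_h            # same ratio for both axes
    new_w = max(1, round(old_w * scale))
    scaled = resize_bilinear(glyph, new_w, TARGET_HEIGHT)
    result = []
    for row in scaled:
        row = row[:max_width]                             # truncate if too wide
        row = row + [blank] * (max_width - len(row))      # pad if too narrow
        result.append(row)
    return result
```

With these assumed values, a glyph 50 pixels wide and 30 high is scaled
to 25x15 and then cut to 20x15, while one 10 pixels wide and 30 high is
scaled to 5x15 and padded with 15 blank columns.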

* Feed the values from this grid directly into the ANN (would I need
to process this a bit in order to reduce the number of inputs?)

* ANN categorizes the glyph into a character (a-zA-Z0-9 and a few
symbols, maybe 1 boolean value as well for bold/normal indication).


I don't know if this is possible, but from what little I have read I
would assume this ANN might have around 300 (multiple intensity)
inputs and 200 discrete outputs. I am not sure how many hidden nodes
it would need or what structure to give it.

I.e. assuming a 20x15 grid of greyscale pixels, where each pixel can
have a value from 0 - 255, this would be provided as the input to the
ANN.
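For illustration, a 20x15 grid can be flattened into the 300 input
values like this. Dividing by 255 to bring each intensity into the
range 0 to 1 is an assumption on my part (a common convention for
networks with real-valued inputs), and `to_input_vector` is a
hypothetical name.

```python
def to_input_vector(grid, max_value=255):
    """Flatten a list-of-rows pixel grid, row by row, into floats in [0, 1]."""
    return [pixel / max_value for row in grid for pixel in row]
```

For a grid of 15 rows with 20 pixels each, the result has
20 * 15 = 300 entries, one per ANN input.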



-- Questions --

Can ANN inputs accept multi-level inputs or only binary inputs?

For the output I would expect a set of discrete classifications, maybe
8 binary output lines (1 byte), where the top bit would be 1 for bold
and 0 for normal, and the lower 7 bits would be the ASCII code of
the categorized glyph.
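A small sketch of that 8-line encoding, to make the bit layout
concrete. The function names and the 0.5 threshold used to turn real
valued outputs back into bits are assumptions, not something the
scheme above specifies.

```python
def encode_target(char, bold):
    """Return 8 target bits, most significant first: bold flag + 7-bit ASCII."""
    code = ord(char)
    if code > 127:
        raise ValueError("only 7-bit ASCII characters fit this encoding")
    value = (0x80 if bold else 0x00) | code
    return [(value >> bit) & 1 for bit in range(7, -1, -1)]


def decode_outputs(outputs, threshold=0.5):
    """Turn 8 output activations back into (character, is_bold)."""
    value = 0
    for activation in outputs:
        value = (value << 1) | (1 if activation >= threshold else 0)
    return chr(value & 0x7F), bool(value & 0x80)
```

For example, a bold 'e' (ASCII 101) becomes the targets
1 1 1 0 0 1 0 1. One thing this layout makes visible is that a single
wrong bit yields a different, valid-looking character, which is one
reason a separate output per class (as in the "200 discrete outputs"
guess earlier) is often considered instead.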

If it fails to recognize a glyph, can you train an ANN to have a
"default" case?
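One possible way to get a "default" case without training for it is to
reject any result where an output is not clearly high or clearly low.
The sketch below assumes 8 outputs in the range 0 to 1; the margin of
0.3 and the name `decode_or_reject` are arbitrary illustrations, not a
recommendation.

```python
def decode_or_reject(outputs, margin=0.3):
    """Return the decoded 8-bit value, or None if any output is ambiguous.

    An output is treated as ambiguous when it lies within `margin`
    of 0.5, i.e. it is neither clearly 0 nor clearly 1.
    """
    value = 0
    for activation in outputs:
        if abs(activation - 0.5) < margin:
            return None                      # the "default" / unrecognized case
        value = (value << 1) | (1 if activation > 0.5 else 0)
    return value
```

A caller would treat None as "glyph not recognized" and could fall
back to some other method for that glyph.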

Is all this feasible, or are there some pointers people can give on a
better way of achieving this?

Also, one last question: where can I find out what the computational
complexity is of using simulated ANNs to perform the categorization
(not the training, as I assume I will have done that "manually"
beforehand)? If they are a LOT slower than my current bitmap comparison
technique then I will probably have to give them a miss.
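As a rough way to frame the comparison: for a fully connected network
with one hidden layer, classifying one glyph costs about one
multiply-add per weight, while bitmap matching costs about one pixel
comparison per pixel per stored pattern. The sketch below just does
that arithmetic. The hidden layer size of 50 and the count of 200
stored patterns are made-up numbers for illustration, and the single
hidden layer is an assumption about the network structure.

```python
def mlp_multiply_adds(n_inputs, n_hidden, n_outputs):
    """Multiply-adds for one forward pass through a one-hidden-layer network."""
    return n_inputs * n_hidden + n_hidden * n_outputs


def template_comparisons(n_pixels, n_templates):
    """Pixel comparisons when a glyph is checked against every stored pattern."""
    return n_pixels * n_templates


ann_cost = mlp_multiply_adds(300, 50, 8)      # 300*50 + 50*8 = 15400
bitmap_cost = template_comparisons(300, 200)  # 300*200 = 60000
```

These are operation counts only; each node's activation function adds
some cost on top, and a bitmap matcher may be able to stop early on a
mismatch, so the real ratio would need to be measured.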

Thanks for any information,
Brendon.
 




