Chinese Handwriting Recognition Hangups

While most of you responded to my earlier query about Asian handwriting recognition to say that, often, it’s even easier than Romanized character recognition, reader James Yopp explains how sometimes Asian character sets can present a unique set of problems.

After the jump, that is.

HWR in Chinese is a completely different beast. Unlike in English, where we expect to be able to write our characters in any visually recognizable way and have them recognized properly, character recognition in Chinese, Japanese, and other Sino-derived languages depends very heavily on the order, direction, and count of the strokes used to compose the character. Most people don’t bother, and just use romanization to type and input text on their PCs and cellphones. For Japanese, everyone (except a few government workers) uses romanization to input data to a PC. In Chinese, phonetic characters are often used on the keyboard (a rough equivalent to the Japanese Kana input), but romanization via pinyin readings is just as popular.

The simplest HWR example I can give you is the character for “mouth”, a square box (an amazing number of characters either include it or are derived from it, so it’s the most pertinent example as well). It is drawn first with a downward stroke on the left, then with a stroke that traces the top and right of the box, then finally with the bottom line.

In the attached image, the first image is how the character *should* be drawn. The second image, although it is totally wrong, would be recognized as the same character by most handwriting engines. The fourth symbol would be misinterpreted as another character entirely, even though it would be legible on paper.
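
To make the stroke-order dependence concrete, here is a toy sketch in Python. It is not how any real handwriting engine works. The direction codes (`D`, `R`, `RD`), the `TEMPLATES` table, and both function names are assumptions made up for illustration. It only shows why the same three strokes, drawn in a different order, can fail to match the "mouth" character when the matcher compares strokes in sequence.

```python
from typing import Optional, Sequence

# Hypothetical templates: each character is an ordered tuple of stroke codes.
# D = downward, R = rightward, RD = rightward then downward in one stroke.
TEMPLATES = {
    "口": ("D", "RD", "R"),  # left side, then top-and-right, then bottom
    "十": ("R", "D"),        # horizontal, then vertical
}


def recognize(strokes: Sequence[str]) -> Optional[str]:
    """Match strokes against the templates, respecting stroke order."""
    drawn = tuple(strokes)
    for char, template in TEMPLATES.items():
        if drawn == template:
            return char
    return None


def recognize_ignoring_order(strokes: Sequence[str]) -> Optional[str]:
    """Match on the set of strokes only, the way a reader on paper might."""
    drawn = sorted(strokes)
    for char, template in TEMPLATES.items():
        if drawn == sorted(template):
            return char
    return None


correct = ["D", "RD", "R"]      # the order described above
wrong_order = ["R", "RD", "D"]  # same strokes, bottom line drawn first

print(recognize(correct))                     # matches the template
print(recognize(wrong_order))                 # no match: order differs
print(recognize_ignoring_order(wrong_order))  # matches once order is ignored
```

A real engine would presumably tolerate some variation, which fits the observation that one of the wrong drawings is still recognized. The sketch only covers the strict case.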
