Report No. UIUCDCS-R-74-660

ON-LINE CHARACTER RECOGNITION

by

Alfred C. Weaver

August 1974

DEPARTMENT OF COMPUTER SCIENCE
UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN
URBANA, ILLINOIS 61801

Digitized by the Internet Archive in 2013
http://archive.org/details/onlinecharacterr660weav

TABLE OF CONTENTS

1. THE EVOLUTION OF ON-LINE CHARACTER RECOGNITION
   1.1 Introduction
   1.2 History
   1.3 Essential Characteristics of Recognizers
       1.3.1 Hardware
       1.3.2 Software
       1.3.3 Feature Recognition
   1.4 Advantages
   1.5 Disadvantages
   1.6 Summary
2. THE RECOGNITION ALGORITHM
   2.1 The ONLINE Simulation Program
       2.1.1 Theory
       2.1.2 Program Input
       2.1.3 Stroke Encoding
       2.1.4 The Dictionary
       2.1.5 Simulator Operation
             2.1.5.1 Training Mode Detail
             2.1.5.2 Recognition Mode Detail
LIST OF REFERENCES

APPENDIX

1. A POSSIBLE STROKE SEQUENCE FOR THE ALPHANUMERIC CHARACTERS
2. SOURCE CODE FOR 'ONLINE' SIMULATOR PROGRAM
3. SAMPLE OUTPUT FROM 'ONLINE'
4. SAMPLE OUTPUT FROM 'ONLINE'

LIST OF FIGURES

1. A Voltage Gradient Tablet
2. Stroke Sequences Encoded by Regions
3. Listing of 4-bit Code Words

1. THE EVOLUTION OF ON-LINE CHARACTER RECOGNITION

1.1 Introduction

The impetus for this project on machine recognition of hand-printed characters was my dissatisfaction with the current state of on-line character recognition machinery and algorithms, as described in Chapter 12 of Principles of Interactive Computer Graphics [4], by Newman and Sproull. The criteria for the "perfect recognizer," as suggested by Newman and Sproull, are:
(1) responds quickly;
(2) high rate of success of recognition;
(3) tolerates variation in size, style, and orientation;
(4) uses computer resources sparingly.
The authors point out that no current (1973) recognizer can claim to meet all these criteria perfectly. I believe that the hardware and software which I propose will meet the above criteria as closely as any on-line character recognition machinery I have seen, while maintaining a modest cost.

1.2 History

The first on-line character recognizers used graphic tablets and minicomputers as the hardware, and had very inflexible software: the user could not train the machine to recognize "his" character; rather, the machine trained the user to adopt a predefined style. An example of just such a system is discussed by Groner [2]. The first trainable recognizer was developed by Teitelman [5]; others, including Bernstein [1], soon developed trainable programs utilizing the RAND tablet. While progress on such systems has not altogether stopped, the number of references to recent work (after 1970) is surprisingly and disappointingly small.

1.3 Essential Characteristics of Recognizers

1.3.1 Hardware

To recognize the alphabetic, numeric, and special characters of, say, the ASCII code, as well as other special characters demanded by any particular system, a graphic tablet is essential. Previous work with RAND and Sylvania tablets indicates that a large portion of the hardware budget goes into the tablet itself.
My proposal is to substitute a voltage gradient tablet of the form described by Newman and Sproull [4]:

[Figure 1. A Voltage Gradient Tablet. Labels: partially conductive tablet surface; y-direction voltage source; x-direction voltage source.]

The advantages of this choice are primarily its lower cost, secondarily its inherent ability to be fine-tuned in the field, and thirdly the small amount of additional (and expensive) hardware which must be added to the basic tablet to realize a useful input device.

In addition to the voltage sources shown in Figure 1, the system needs simple circuitry to recognize an interrupt from the stylus (making or breaking contact with the tablet surface), an A/D converter, an interrupt flip-flop, and two 10-bit X and Y position registers. The necessary computing can be handled entirely by a microprocessor of the Intel 8008 variety (1974 prices are approximately $100 for the chip alone, or $900 for the MCS-8, which includes the 8008 chip, all other necessary circuitry, and 3K memory). I chose this particular microcomputer because I was familiar with its capabilities and knew it could easily be adapted for this system.

1.3.2 Software

The basic requirements of the software system are:
(1) read the tablet;
(2) extract important features from the character;
(3) look up the character in a dictionary;
(4) build the dictionary through a training routine.
All of the above are presently in my simulator program ONLINE. I am convinced that the entire program will fit with room to spare on an MCS-8.

1.3.3 Feature Recognition

The crucial part in the design of the recognizer is the choice of features to be extracted from the input strokes. A good recognizer must discriminate between different characters while recognizing slightly different versions of the same character. The most popular features extracted by current mechanisms are the regions visited by each stroke.
By predefining a standard rectangle, or by normalizing the stroke to some standard rectangle, and then dividing that rectangle into a number of regions, a program may then count and record the number of times a stroke crosses a region boundary. This is exactly the scheme used by both Bernstein [1] and Teitelman [5], as shown in Figure 2.

[Figure 2. Stroke Sequences Encoded by Regions.]

After collecting a stroke sequence, these schemes searched a tree-structured dictionary, yielding the character to be recognized. Ledeen [3] developed a simple recognizer which used a similar feature extraction mechanism, but utilized a more compact dictionary (about 1K 16-bit words).

My objections to these known schemes are three-fold:
(1) The use of RAND-type tablets is too costly;
(2) I disagree in theory with having to write into a predefined rectangle, using predefined region boundaries. The alternative, "clipping" to the boundaries of the character itself and then subdividing into regions, only increases the amount and cost of processing power and memory needed, and hence is equally unsatisfactory.
(3) The tree-structured dictionary uses more memory than necessary. If the dictionary is actually a tree, with nodes and pointers, approximately 2/3 of the memory space is devoted to pointers and only 1/3 to data. If the dictionary is a binary tree, but structured as an array [left son of node n at T(2n) and right son at T(2n+1)], then the total array area is determined by the depth of the tree, i.e., by the longest stroke sequence. Since adding one level to the tree doubles the array area required, effective memory utilization remains a problem.

My solutions to the three objections above are:
(1) Use a resistive tablet. This relatively low-cost input device contains all the sophistication necessary for character recognition;
(2) Make the recognition process independent of the size and location of the character drawn.
This is accomplished by a new definition of the "essential features" of characters: the order of strokes and their direction, independent of their physical location;
(3) Encode the stroke sequence and orientation into one code number (explained in Section 2, The Recognition Algorithm) and look up the code number in a sorted dictionary. The code number has the same information content as the tree traversal order, without the memory overhead.

The whole scheme is made viable by the classical approach of an initial training period in which the user trains the machine to recognize his own writing style (in fact, the user may teach several stroke sequences for the same character). Following the training period the machine runs in its normal recognition mode. Provision is made for retraining the machine when another user takes over.

1.4 Advantages

The worth of the solution described here is enhanced by its direct applicability to a microprocessor environment, its sparing use of costly memory, and the small amount of interface hardware necessary between the tablet and processor. It allows the user to train the machine to his own writing style. Different users may operate the machine by utilizing a retraining mode. Furthermore, the entire algorithm is programmable using only addition, subtraction, and testing. No multiplications or divisions are required.

1.5 Disadvantages

It was first thought that the requirement of distinct strokes would allow recognition of only block-style printed characters. However, the scheme generalized quite nicely to the normal spectrum of printed characters, with only a very few unnatural cases caused by describing two different characters by the same stroke pattern.

1.6 Summary

I believe that the "Weaver method" represents an elegant solution to an otherwise sticky problem. Like its predecessors, this method fails to fully satisfy all of Newman and Sproull's basic criteria for the perfect recognizer.
Yet it satisfies sufficiently many of them that it would perform well in many common situations. Its most significant advantages are its low cost and ease of implementation.

Appendix 1 contains a list of the alphanumeric characters and their stroke sequence definitions for the author's style of writing. Appendix 2 is a listing of the computer program which simulates the hardware and software systems. Appendix 3 is a sample run which shows the training and recognition of all 36 alphanumeric characters. Appendix 4 illustrates the change of mode, error detection, and symbol non-recognition.

2. THE RECOGNITION ALGORITHM

2.1 The ONLINE Simulation Program

The program ONLINE simulates the actions of:
(1) A resistive tablet and stylus;
(2) The hardware interface between tablet and processor, including X- and Y-position 10-bit registers and a 1-bit interrupt flag;
(3) The processor (a microprocessor, such as the Intel 8008).
This simple equipment is shown to be sufficient for an on-line character recognition system.

2.1.1 Theory

The touching of the stylus to the tablet generates an interrupt and sets an interrupt flag. Upon seeing this flag the processor copies the contents of the position registers into its variables X1 and Y1. The lifting of the stylus from the tablet also generates an interrupt and sets an interrupt flag. This time the processor copies the position registers into its variables X2 and Y2. Now by simple subtraction and testing, the processor determines a four-bit code word for each line (X1,Y1) -> (X2,Y2) drawn. The code is of the form C1C2C3C4, where
C1 is 1 only if the line was drawn from top to bottom,
C2 is 1 only if the line was drawn from bottom to top,
C3 is 1 only if the line was drawn from left to right, and
C4 is 1 only if the line was drawn from right to left.
The code word is set by a program segment similar to this:

    C1 = C2 = C3 = C4 = 0;
    if (Y1-Y2) > EPSILON then C1 = 1;
    else if (Y2-Y1) > EPSILON then C2 = 1;
    if (X2-X1) > EPSILON then C3 = 1;
    else if (X1-X2) > EPSILON then C4 = 1;

2.1.2 Program Input

The input to the simulator program for any one character is a series of "strokes" which represent the hardware result of the previous algorithm. For instance, the letter "A" is defined by three strokes:

    stroke number   stroke orientation               stroke encoding
    1               top to bottom & right to left    TB,RL/
    2               top to bottom & left to right    TB,LR/
    3               left to right                    LR/

Each stroke is encoded by using the obvious 2-character mnemonic for each of the four possible directions, with a "/" used to indicate the end of one stroke. Thus the full description of "A" is:

    A: TB,RL/TB,LR/LR/

2.1.3 Stroke Encoding

From this description, provided as a character string on an input card to the simulator, the processor decodes the input into a sequence of 4-bit code words, one code word per stroke. Now, by using a binary digit weighting scheme on the 4-bit words, a unique hexadecimal (0-15) digit results. Note that, of the 16 possible codes, only 8 are used (combinations like top-to-bottom and bottom-to-top are clearly impossible). The usable combinations are shown in Figure 3.

    C1 (TB)   C2 (BT)   C3 (LR)   C4 (RL)   Hexadecimal Equivalent
    0         0         0         1         1
    0         0         1         0         2
    0         1         0         0         4
    0         1         0         1         5
    0         1         1         0         6
    1         0         0         0         8
    1         0         0         1         9
    1         0         1         0         10

    Figure 3. Listing of 4-bit Code Words

Thus any sequence of strokes is reducible to a series of hex digits; "A" is encoded as 9/10/2/.

Since 10 is the largest hex digit used, subtracting one from each of the hex equivalents in Figure 3 provides a corresponding set of decimal digits which are more easily handled. The decimal digits corresponding to the stroke sequence are then weighted with powers of ten according to their positions in the stroke sequence, to yield a single decimal number as the representative code number.
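As an illustration only, the encoding of Sections 2.1.1 through 2.1.3 might be sketched in Python as follows. This is not the ONLINE simulator, which is written in PL/I and listed in Appendix 2. The names stroke_mnemonic, code_number, and WEIGHTS, and the value chosen for EPSILON, are assumptions introduced for the example; the report does not state a value for EPSILON. A stroke with no direction is mapped to hex 7 (decimal digit 6), following the IF C = 0 THEN C = 7 step of the Appendix 2 listing.

```python
EPSILON = 8  # assumed tolerance in tablet units; not specified in the report

# Bit weights of the 4-bit code word C1 C2 C3 C4 (TB, BT, LR, RL).
WEIGHTS = {"TB": 8, "BT": 4, "LR": 2, "RL": 1}


def stroke_mnemonic(x1, y1, x2, y2, eps=EPSILON):
    """Classify the line (x1, y1) -> (x2, y2) by subtraction and testing only.

    Assumes Y grows upward, as implied by the report's segment
    (Y1 - Y2 > EPSILON means top to bottom).
    """
    parts = []
    if y1 - y2 > eps:
        parts.append("TB")
    elif y2 - y1 > eps:
        parts.append("BT")
    if x2 - x1 > eps:
        parts.append("LR")
    elif x1 - x2 > eps:
        parts.append("RL")
    return ",".join(parts)


def code_number(sequence):
    """Convert a stroke sequence such as 'TB,RL/TB,LR/LR/' to its code number."""
    if not sequence.endswith("/"):
        raise ValueError("each stroke must end with '/'")
    code = 0
    for field in sequence[:-1].split("/"):
        # Binary weighting of the mnemonics gives the hex equivalent.
        hex_digit = sum(WEIGHTS[m.strip()] for m in field.split(",") if m.strip())
        if hex_digit == 0:
            hex_digit = 7  # stroke with no direction, as in the Appendix 2 listing
        # Subtract one to get a decimal digit, then append it to the code number.
        code = code * 10 + (hex_digit - 1)
    return code


print(stroke_mnemonic(100, 200, 40, 20))  # TB,RL
print(code_number("TB,RL/TB,LR/LR/"))     # 891
```

On the target microprocessor the multiplication by ten would be replaced by repeated addition, as the comment in the simulator listing notes; the sketch uses ordinary multiplication for brevity.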
Repeating the example of the character "A":

    character          A
    simulator input    TB,RL/TB,LR/LR/
    hex                9/10/2
    decimal            8/9/1
    code number        891

Clearly, this process may be run backwards to generate the defining stroke sequence which generated any given code number.

The worst case for the length of a stroke sequence for common alphanumeric characters appears to be 4 strokes. Even allowing for 5 strokes, the corresponding code number is less than 10^5, and so remains well within the range of microcomputer arithmetic.

While the above scheme works quite well, generalizing from block characters to printed characters required one minor modification: recognition of the "null" stroke generated by curves which begin and end within EPSILON of the same point, as in the letter "O", the number "0", the letter "Q", and the numbers "6", "8", and "9". This was easily accomplished by allowing a previously unused decimal digit (6) to mark a "null" stroke.

2.1.4 The Dictionary

Now that the input stroke sequence has been encoded into a few bits, it appears to be best managed by storing the decimal code numbers and the characters which they represent in a dictionary, formed from two linear arrays, and arranged in sorted order by ascending code numbers. The beginning of a typical dictionary might be:

    071   C
    711   F
    891   A
    ...

A binary search on the code numbers suffices to quickly locate any character in the list. In recognition mode this is exactly what happens; in training mode the addition of a new code-number-and-character combination uses the binary search to find its proper position in the list. The list is then shifted to make a hole, and the new information is inserted. The efficiency of the binary search is well known; for a list of, say, 64 characters, the maximum number of probes required to locate any character is about log2 64 = 6.
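The dictionary just described can be sketched in a few lines of Python. This is a hedged illustration rather than the simulator's own code: the class name StrokeDictionary and its methods train and recognize are hypothetical, and Python's standard bisect module stands in for the hand-written binary search. As in the simulator's INSERT procedure, a code number already present is left unchanged.

```python
from bisect import bisect_left


class StrokeDictionary:
    """Two parallel lists kept in ascending order of code number."""

    def __init__(self):
        self.codes = []  # decimal code numbers, sorted ascending
        self.chars = []  # chars[i] is the character stored for codes[i]

    def train(self, code, char):
        """Insert (code, char) at its sorted position.

        Returns False, and changes nothing, if the code is already present.
        """
        i = bisect_left(self.codes, code)  # binary search for the position
        if i < len(self.codes) and self.codes[i] == code:
            return False
        # list.insert shifts the later entries down to make a hole.
        self.codes.insert(i, code)
        self.chars.insert(i, char)
        return True

    def recognize(self, code):
        """Return the character stored for code, or None if it is unknown."""
        i = bisect_left(self.codes, code)
        if i < len(self.codes) and self.codes[i] == code:
            return self.chars[i]
        return None


d = StrokeDictionary()
d.train(891, "A")
d.train(711, "F")
d.train(71, "C")
print(d.codes)           # [71, 711, 891]
print(d.recognize(711))  # F
```

Keeping two flat, sorted lists rather than a tree of nodes and pointers reflects the memory argument of Section 1.3.3: every stored word is data, and lookup still needs only a logarithmic number of probes.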
2.1.5 Simulator Operation

The simulator works in one of five modes:

Training: A character is presented and its stroke sequence defined.
    Example: A: TB,RL/TB,LR/LR/
             Z: LR/TB,RL/LR/
Recognition: Only the stroke sequence is presented; it is decoded and its associated character printed.
    Example: TB/TB/LR/LR/
Restart: Clears all currently stored information in preparation for a new training sequence.
Ignore: Invalid commands are ignored.
Stop: Program terminates.

2.1.5.1 Training Mode Detail

The first character of the input card contains the new character to be learned. The remaining characters are the stroke sequence mnemonics which the user wishes to associate with this character. The stroke sequence is decoded into its code number n, the current (sorted) list of code numbers is binary searched to find this number's proper location, all entries whose code number is larger than n are moved down, and the current code number and character are inserted.

2.1.5.2 Recognition Mode Detail

Each input card contains only the simulated stroke sequence. The sequence is decoded into its corresponding code number and the current dictionary is searched. If the code number is found, its associated character is echoed along with the message STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "*", where * is the recognized character. If the code number is not found, the message "CHARACTER NOT RECOGNIZED. TRY AGAIN" appears. If, when the stroke sequence is repeated, it is still not recognized, the message "STILL NOT RECOGNIZED. RETRAIN FOR THIS SYMBOL" is produced. It is probable that the user has now either changed his defining stroke sequence for a particular character, or used a new symbol unfamiliar to the recognizer. In either case, the recognizer should return to training mode to learn a new stroke sequence; this is speedily accomplished with the $TRAIN command.

LIST OF REFERENCES

1. Bernstein, M. I., and T. G.
Williams, "A Two-Dimensional Programming System," Proceedings 1968 IFIP Congress, North Holland Pub. Co., page 586, 1969.
2. Groner, G. F., "Real-Time Recognition of Hand Printed Text," FJCC 1966, Spartan Books, page 591.
3. Ledeen, K. S., "The Ledeen Character Recognizer," Principles of Interactive Computer Graphics, appendix VIII, 1973.
4. Newman, W. M., and R. F. Sproull, Principles of Interactive Computer Graphics, McGraw-Hill Book Co., 1973.
5. Teitelman, W., "Real Time Recognition of Hand-Drawn Characters," FJCC 1964, Spartan Books, page 559.

APPENDIX 1

The alphanumeric characters and their defining stroke sequences, as determined by the author's style of writing.

A POSSIBLE STROKE SEQUENCE FOR THE ALPHANUMERIC CHARACTERS

[The original table gives, for each character, drawings of its first through fourth strokes and its decimal code number. The stroke drawings are not reproduced; "?" marks a code number that is illegible in the source.]

    char  code      char  code      char  code      char  code
    A     891       J     81        S     8         1     78
    B     777       K     789       T     71        2     97
    C     7         L     9         U     ?         3     188
    D     77        M     8989      V     98        4     87
    E     7111      N     797       W     9898      5     781
    F     711       O     6         X     89        6     76
    G     91        P     ?         Y     987       7     18
    H     717       Q     69        Z     181       8     ?
    I     171       R     39        0     68        9     ?

APPENDIX 2

SOURCE CODE FOR 'ONLINE' SIMULATOR PROGRAM

    ... = '$' THEN
          /* SET PROPER MODE */
          IF CARD = '$TRAIN' THEN MODE = 1;
          ELSE IF CARD = '$RECOGNIZE' THEN MODE = 2;
          ELSE IF CARD = '$STOP' THEN MODE = 3;
          ELSE IF CARD = '$RESTART' THEN CALL RESTART;
          ELSE MODE = 0;
       /* OTHERWISE, CARD IS AN INPUT STROKE SEQUENCE */
       ELSE DO;
          /* DECODE THE STROKE SEQUENCE */
          CALL DECODE;
          /* FIND CODE WORD IN TABLE */
          CALL LOOKUP;
          /* WHICH MODE ARE WE IN? */
          IF MODE = 1 THEN
             /* IN TRAINING MODE */
             CALL INSERT;
          ELSE
             /* IN RECOGNITION MODE */
             DO;
                IF FOUND /* ALREADY IN DICTIONARY */
                THEN PUT EDIT ('STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "',
                               RCHAR(POINTER), '"') (3 A);
                ELSE /* NOT IN DICTIONARY */
                   DO;
                      IF NOTFOUND = 1 /* FIRST TIME */
                      THEN PUT EDIT ('CHARACTER NOT RECOGNIZED. TRY AGAIN') (SKIP, A);
                      ELSE /* SECOND TIME NOT FOUND */
                         PUT EDIT ('STILL NOT RECOGNIZED. RETRAIN FOR THIS SYMBOL') (SKIP, A);
                   END;
             END;
       END;
    END;
    PUT SKIP(3) EDIT ('END OF PROGRAM') (A);
    RETURN;

    RESTART: PROCEDURE;
       /* A PROCEDURE TO CLEAR THE DATA AREA */
       DECLARE I FIXED BINARY (31);
       POINTER, #ENT, CODE#, MODE, NOTFOUND = 0;
       DO I = 1 TO 64;
          LIST(I) = 0;
          RCHAR(I) = ' ';
       END;
    END RESTART;

    LOOKUP: PROCEDURE;
       /* A PROCEDURE TO FIND THE LOCATION OF 'CODE#' IN 'LIST' */
       DECLARE (HEAD INIT(0), TAIL, MID, #2 INIT(2)) FIXED BINARY (31);
       TAIL = #ENT + 1;
       FOUND = '0'B;
       /* STANDARD BINARY SEARCH */
       DO WHILE (TAIL-HEAD > 1);
          MID = (HEAD+TAIL)/#2;
          IF CODE# < LIST(MID) THEN TAIL = MID;
          ELSE IF CODE# > LIST(MID) THEN HEAD = MID;
          ELSE DO;
             FOUND = '1'B;
             NOTFOUND = 0;
             POINTER = MID;
             RETURN;
          END;
       END;
       POINTER = HEAD;
       NOTFOUND = NOTFOUND + 1;
    END LOOKUP;

    INSERT: PROCEDURE;
       /* A PROCEDURE TO INSERT 'CODE#' AND 'NEWCHAR' INTO DICTIONARY */
       DECLARE I FIXED BINARY (31);
       IF FOUND THEN RETURN;
       POINTER = POINTER + 1;
       DO I = #ENT TO POINTER BY -1;
          LIST(I + 1) = LIST(I);
          RCHAR(I + 1) = RCHAR(I);
       END;
       LIST(POINTER) = CODE#;
       RCHAR(POINTER) = NEWCHAR;
       #ENT = #ENT + 1;
    END INSERT;

    DECODE: PROCEDURE;
       /* A PROCEDURE TO READ STROKE SEQUENCE AND PRODUCE 'CODE#' */
       DECLARE FIELD CHAR(10) VAR, (I,C) FIXED BINARY(31);
       /* IF IN TRAINING MODE, EXTRACT NEW CHARACTER */
       IF MODE = 1 THEN DO;
          NEWCHAR = SUBSTR(CARD,1,1);
          CARD = SUBSTR(CARD,3);
       END;
       CODE# = 0;
       /* REPEAT FOR EACH STROKE */
       DO WHILE (CARD ¬= ' ');
          I = INDEX(CARD,'/');
          /* CATCH INPUT FORMAT ERRORS */
          IF I = 0 THEN DO;
             /* STOP ON INPUT FORMAT ERROR */
             PUT SKIP EDIT ('INPUT FORMAT ERROR') (A);
             STOP;
          END;
          ELSE DO;
             FIELD = SUBSTR(CARD,1,I);
             CARD = SUBSTR(CARD,I+1);
             C = 0;
             IF INDEX(FIELD,'TB')>0 THEN C=C+8;
             IF INDEX(FIELD,'BT')>0 THEN C=C+4;
             IF INDEX(FIELD,'LR')>0 THEN C=C+2;
             IF INDEX(FIELD,'RL')>0 THEN C=C+1;
             IF C = 0 THEN C = 7;
             /* IN THE MICROPROCESSOR CODE THIS MULTIPLICATION
                MAY BE REPLACED BY ADDITIONS */
             CODE# = CODE# * 10 + C - 1;
          END;
       END;
    END DECODE;
    END ONLINE;

APPENDIX 3

Sample output from ONLINE showing training and recognition for the 36 alphanumeric characters.

    $TRAIN
    A: TB,RL/TB,LR/LR/
    B: TB/TB/TB/
    C: TB/
    D: TB/TB/
    E: TB/LR/LR/LR/
    F: TB/LR/LR/
    G: TB,LR/LR/
    H: TB/LR/TB/
    I: LR/TB/LR/
    J: TB,RL/LR/
    K: TB/TB,RL/TB,LR/
    L: TB,LR/
    M: TB,RL/TB,LR/TB,RL/TB,LR/
    N: TB/TB,LR/TB/
    O: /
    P: BT/
    Q: /TB,LR/
    R: BT/TB,LR/
    S: TB,RL/
    T: TB/LR/
    U: LR/
    V: TB,LR/TB,RL/
    W: TB,LR/TB,RL/TB,LR/TB,RL/
    X: TB,RL/TB,LR/
    Y: TB,LR/TB,RL/TB/
    Z: LR/TB,RL/LR/
    0: /TB,RL/
    1: TB/TB,RL/
    2: TB,LR/TB/
    3: LR/TB,RL/TB,RL/
    4: TB,RL.TB/
    5: TB/TB,RL/LR/
    6: TB//
    7: LR/TB,RL/
    8: //
    9: /TB/
    $RECOGNIZE
    TB,RL/TB,LR/LR/             STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "A"
    TB/TB/TB/                   STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "B"
    TB/                         STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "C"
    TB/TB/                      STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "D"
    TB/LR/LR/LR/                STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "E"
    TB/LR/LR/                   STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "F"
    TB,LR/LR/                   STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "G"
    TB/LR/TB/                   STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "H"
    LR/TB/LR/                   STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "I"
    TB,RL/LR/                   STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "J"
    TB/TB,RL/TB,LR/             STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "K"
    TB,LR/                      STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "L"
    TB,RL/TB,LR/TB,RL/TB,LR/    STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "M"
    TB/TB,LR/TB/                STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "N"
    /                           STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "O"
    BT/                         STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "P"
    /TB,LR/                     STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "Q"
    BT/TB,LR/                   STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "R"
    TB,RL/                      STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "S"
    TB/LR/                      STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "T"
    LR/                         STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "U"
    TB,LR/TB,RL/                STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "V"
    TB,LR/TB,RL/TB,LR/TB,RL/    STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "W"
    TB,RL/TB,LR/                STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "X"
    TB,LR/TB,RL/TB/             STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "Y"
    LR/TB,RL/LR/                STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "Z"
    /TB,RL/                     STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "0"
    TB/TB,RL/                   STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "1"
    TB,LR/TB/                   STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "2"
    LR/TB,RL/TB,RL/             STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "3"
    TB,RL.TB/                   STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "S"
    TB/TB,RL/LR/                STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "5"
    TB//                        STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "6"
    LR/TB,RL/                   STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "7"
    //                          STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "8"
    /TB/                        STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "9"
    $STOP

    END OF PROGRAM

APPENDIX 4

Sample output from ONLINE showing change of mode, error detection, and symbol non-recognition.

    $RECOGNIZE
    TB/BT/TB/
    CHARACTER NOT RECOGNIZED. TRY AGAIN
    $TRAIN
    A: TB,RL/TB,LR/LR/
    B: TB/TB/TB/
    C: TB/
    D: TB/TB/
    E: TB/LR/LR/LR/
    F: TB/LR/LR/
    $RECOGNIZE
    TB/LR/LR/LR/                STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "E"
    TB,RL/TB,LR/LR/             STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "A"
    TB/TB/LR/LR/
    CHARACTER NOT RECOGNIZED. TRY AGAIN
    TB/TB/LR/LR/
    STILL NOT RECOGNIZED. RETRAIN FOR THIS SYMBOL
    $TRAIN
    #: TB/TB/LR/LR/
    $RECOGNIZE
    TB/TB/LR/LR/                STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "#"
    $RESTART
    $TRAIN
    0: /TB,RL/
    1: TB/TB,RL/
    2: TB,LR/TB/
    3: LR/TB,RL/TB,RL/
    4: TB,RL/TB/
    5: TB/TB,RL/LR/
    6: TB//
    7: LR/TB,RL/
    8: //
    9: /TB/
    $RECOGNIZE
    TB,LR/TB/                   STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "2"
    TB//                        STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "6"
    /TB/                        STROKE SEQUENCE RECOGNIZED AS THE CHARACTER "9"
    $BADCOMMAND
    $BADCOMMAND
    $RECOGNIZE
    TB,LR,TB,RL
    INPUT FORMAT ERROR

BIBLIOGRAPHIC DATA SHEET

Report No.: UIUCDCS-R-74-660
Title and Subtitle: On-line Character Recognition
Report Date: August 1974
Author(s): Weaver, A. C.
Performing Organization Rept. No.: UIUCDCS-R-74-660
Performing Organization Name and Address: Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801
Sponsoring Organization Name and Address: Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801

Abstract: This report discusses the current state of the art in computer recognition of handwritten characters. Several current schemes are examined and criticized. A new recognition method is developed utilizing a voltage gradient tablet for input and clever software for essential feature extraction. A simulation program is included as an appendix.