UIUCDCS-R-76-782

DETAILS OF AN EXPERIMENTAL VIDEOTAPE EVALUATION OF AN INTERACTIVE EXAM SYSTEM

by

Richard Doring
Lawrence R. Whitlock
Wilfred J. Hansen

December 1976

DEPARTMENT OF COMPUTER SCIENCE
UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN
URBANA, ILLINOIS 61801

This work was supported in part by the National Science Foundation under grant EC41511.

TABLE OF CONTENTS

1. Introduction
2. Environment
3. Experimental Procedure
4. Summary of Results
5. Design Proverbs
   5.1. Instructions and preparation
   5.2. Problem generation and display
   5.3. Entering answers
   5.4. Errors and help
   5.5. End of Problem
   5.6. Grading

Appendices
A. The Exams Used in the Experiment
   Sample Problem Displays
   Quirks of the Problem Generator/Graders
B. Questionnaires
   Background Sheet
   Questionnaire on PLATO Exam and on Paper Exam
   Written Comments
C. Experiment Administration
   Activities During Experiment Session
   Users Guide for PLATO Exams
   Time Sheet
   Log Sheet
D. Definitions of Categories of Activities
E. Activities Log for Each Subject

1. Introduction

This report provides details of an experiment conducted Monday, September 22, 1975.
An analysis of the data is contained in a companion paper:

Hansen, W. J., R. Doring, and L. R. Whitlock, "A Videotape Analysis of Student Performance on an Interactive Examination," UIUCDCS-R-76-836, Department of Computer Science, University of Illinois, Urbana, Illinois (1976).

So that it can stand by itself, this report repeats sections 2 and 3 of that paper and the major figures from its section 4.

2. Environment

The examination system in question, the Generative Exam System [Whitlock, 1976], offers many advantages. A wide variety of question schemes is available. Geographic and temporal scheduling problems can be simplified by the network nature of the host system. The system can offer non-passive problems wherein subsequent subproblems reflect prior performance. Each student receives a slightly different form of each question (to encourage honesty). Finally, scores and correct answers are instantly ready for student review, and scores and statistics are instantly ready for the instructor.

The exam system, in turn, is but one component of the Automated Computer Science Education System (ACSES) under development by the Computer Science Department at the University of Illinois [Nievergelt, 1976]. ACSES includes over 100 instructional lessons, some twenty of which are in regular use for over 3000 introductory level students each year. These students constitute the population for which the exam system is designed. They are non-major underclassmen from a wide variety of fields. They have undeveloped typing skills, minimal exposure to interactive systems, and little motivation to learn computer programming.

ACSES is implemented on the PLATO system for computer aided instruction developed by the Computer-based Education Research Laboratory, also at the University of Illinois [Bitzer, 1973]. The University's PLATO system is currently the most fully used; it has 1000 terminals connected, of which about 400 are usually in operation.
From a terminal a student can have access to several thousand hours of instruction, several hundred of which have been polished for, and used in, regular courses.

Though the PLATO system has many advantages, it imposes serious constraints on processing power, memory size, disk accesses, and display speed. Each user is limited to a maximum of ten "TIPS" (thousands of instructions per second), and good response cannot be expected if a user requires more than three TIPS. This is adequate for simple question-and-answer interaction, but it cannot support sophisticated versions of data base search, program text analysis, or exam grading. Program and data memories are limited, so large programs and data bases are impractical. Core limitations could be avoided by retrieval from disk, but each user is restricted to an average of one disk access per minute. In the Generative Exam System, this resource is consumed in reading and updating student records for each switch from one problem to another.

PLATO employs a plasma panel terminal. Its 512x512 dot display is high precision but, like a storage tube, cannot support rapid animation. Because it is driven by 1200 baud lines with a maximum of 180 characters per second, the terminal takes ten seconds to display a full page of text. Text may be written in a very rich character set because the terminal provides 128 built-in characters and memory space for another 128 to be defined by the user program. Unfortunately, it can take over ten seconds to transmit the codes to define any substantial number of extra characters.

The keyboard is an augmented typewriter design with a number of "function" keys, including NEXT, BACK, DATA, and HELP. Though a lesson designer can specify any response to these keys, they have certain conventional uses. HELP usually causes display of some additional explanation. BACK moves the lesson back to the last section of material covered.
Shifted-DATA often returns the display to the index page for the lesson. NEXT has two possible meanings: sometimes it terminates an answer and sometimes it simply signals that the student is ready to go on. Usually when the student answers a question and presses NEXT the system responds with "no" or "ok" following the response. This feedback is an important reinforcement during instruction, but must be switched off for exams.

The Generative Exam System is structured as a central monitor and a collection of "problem generator/graders" (PG/G's). The central monitor displays the exam cover page, transfers to individual PG/G's, stores student answers and scores, and generates statistics. Each PG/G displays one or more problems and interacts with the student to accept the answers. The generality of this structure supports an unlimited variety of question types, and within each PG/G random generation techniques provide an infinite number of questions.

Each subject took two exams, one form A and one form B. Each form used the same four PG/G's:

1) Arithmetic expressions - 12 variables and their values were displayed and the student was asked to evaluate five expressions involving those variables.

2) FORTRAN syntax - This PG/G included a page of instructions, a cover page, and three problem pages. Each problem page displayed a FORTRAN statement with possibly a syntactic error created with an Extra, Missing, or Replaced basic symbol. For the experiment, form A had assignment, ASSIGN, and PRINT statements, while form B had assignment, GOTO, and DO. In each case the answer was to be given by specifying a type of error (E, M, R, or None) and the associated basic symbols.

3) Print with FORMAT - A print statement and a FORMAT were displayed along with a grid for the answers. The student entered a line of output and then specified where in the grid it should go. Form A had three F format items; form B had three I format items.
4) DO loop - A DO loop with a PRINT statement was presented and the student had to specify the values that would be printed. This PG/G graded interactively; each answer was checked when it was entered. Points were immediately deducted for an incorrect answer and a second chance was given. This scheme tried to avoid the problem of propagation of errors.

3. Experiment Design

A two by two design was chosen with subjects in each cell taking two forms of an exam, one on paper and one on PLATO. Both forms were generated by the system and copied on a screen copier to generate the paper version. The design assigned exam forms and methods as shown in Figure 3.1. Because the two forms were considered essentially equivalent (and turned out to be so), the subjects were split between PLATO-first and paper-first so that neither treatment was favored by the subjects' self-expressed typing ability. Because only one set of videotape equipment and one set of proctors was available, the experiment was run in separate sessions of roughly one hour each.

Subjects were recruited in an elementary computer science course similar to those for which the exam system will be used (Computer Science 101, an introduction to FORTRAN for engineering students, Fall, 1975). They were offered the inducement of practice for the hour exam they would be taking a week later. For fairness, however, the experiment exams were made available to all students after completion of the experiment. (More recently, the exam system has become a popular way to study for exams.)

We decided to tape only four subjects for several reasons: the major anticipated effect was large (i.e., slowness on PLATO), the equipment was expensive, and we needed experience. We had enough volunteers to schedule an additional four subjects who were not videotaped; this proved fortunate, because one of the eight subjects failed to appear at the appointed hour. When students volunteered, they were asked to state their typing ability.
Taped      Untaped     Method and Form
Subject    Subject     First      Second
SA         -           PL A       pa B
SB         SE          pa A       PL B
SC         SF          PL B       pa A
SD         SG          pa B       PL A

Figure 3.1. Experiment Design. For this paper, the subjects have been randomly coded as SA, ..., SG. "PL" = PLATO. "pa" = paper.

Each experiment session began with a practice exam on PLATO, followed by the two exams dictated by the design. The post-test questionnaire elicited reactions to the two methods of giving exams.

The video camera was located slightly behind the subject's shoulder so as to be out of sight while still recording the subject's face and hands, and the general appearance of the PLATO screen or paper exam. Due to low resolution, it was not possible to record the details of the work, so an observer was positioned behind the subject to make a manual record including the sequence in which the problems were worked. A clock was positioned beside the terminal so that the time was recorded. To avoid time pressure, the clock was turned so the subject could not see its face.

Data analysis began by coding each "activity" observed into one of the categories listed in Figure 3.2.

Brief Title   Code   Description
Think         RT     Read and Think
Think         CP     Calculate with Paper and Pencil
Answer        EA     Enter Answers
Select        PS     Problem Selection
Generate      PG     Problem Generation (PLATO only)
Load          LC     Load Character Set (PLATO only)
Display       PP     Problem Presentation (PLATO only)
Trouble       WN     What Next (subject confused)

Figure 3.2. Activity Categories. For most of this paper, CP (Calculate with Paper and Pencil) has been lumped together with "Think."

4. Results

Data for all seven subjects are presented in Figures 4.1 and 4.2. The former shows both the subjects' backgrounds and their performance on the experiment. One conclusion from this data was that the differences between taped subjects were exacerbated by the experimental situation. This is echoed in the post-test questionnaire results shown in Figure 4.2.
Details of the taped subjects' behavior appear in Figures 4.5 and 4.7. (The figures are numbered as in the analysis paper; the omitted figures present details of the analysis.) The first section of Figure 4.5 presents total times in each category for each subject. The second section lists the number of activities corresponding to the times above. The third section summarizes the data in several categories. A major conclusion from this data was that the PLATO overhead on an exam need not be more than twenty percent of the time to take the exam on paper. Figure 4.7 shows the time each subject spent in productive work on each problem. One observation is that subjects reviewed much more on PLATO, possibly because of lack of confidence in system reliability.

[Figures 4.1, 4.2, 4.5, and 4.7: tabular data not recoverable from the scanned original.]
5. Forty Design Proverbs

Our experiences with the exam system and the videotape experiment have led to development of design principles that minimize user difficulty. These principles show some similarity to those in

Hansen, Wilfred J., "User Engineering Principles for Interactive Systems," Proceedings Fall Joint Computer Conference, V. 39, AFIPS Press, Montvale, NJ, pp. 523-532.

Many of the "proverbs" in subsequent sections are particular instances of a few meta-proverbs.

01. Be Predictable. Users experience enormous frustration when a system does not behave as they have come to expect it to behave. This proverb is especially hard to follow, and, therefore, especially important in a multi-component system like the exam system with its many independently written PG/G's. Areas in which the system should behave predictably are similar to those covered in the next proverb.

02. Follow the Standards. Observance of predefined standards reduces design effort and gives an air of familiarity to the product which reduces the user's effort to learn how to control it.
For the exam system standards exist for several areas: control key conventions, problem page appearance, the response to an answer entry, and the order of generating displays. If we knew enough, we would even specify standards for problem wording.

03. Provide Reversibility. For each action the user can take there must be an obvious way to undo that action. This provision reduces the pressure on the user by reducing the penalty for typing and procedural errors. For the exam system typical actions are to enter an answer or move to the next problem. The reverse actions are to delete an answer and to move to the previous problem; both must be available at every point in every PG/G.

04. Keep Instructions Simple. If the procedures cannot be explained simply, they are too complex and must be revised. A surfeit of instructions reduces the emphasis on each to the point where even important ones are forgotten.

Many of the principles relate to the standard control keys, so it is necessary to discuss them here. There are five control keys (NEXT, BACK, HELP, DATA, and LAB) and their alternates with the shift key (shift-NEXT, shift-BACK, shift-HELP, shift-DATA, shift-LAB).

NEXT: Normally has three uses: end of an answer, end of reading a paragraph (add some to page), and end of reading a page (go to next page). The user usually cannot predict the distinction; often the end of an answer will cause a new page.
   Exam system: End of answer, or move the arrow to the next answer on this page. The arrow cycles from the bottom answer back to the top one.

shift-NEXT: No usual function. Some Computer Science lessons skip to the next topic. Some non-Computer Science lessons jump to the table of contents.
   Exam system: Go to the next problem page.

BACK: Usually takes the user back to the previously displayed page, perhaps for review.
   Exam system: Move the arrow to the previous answer slot. Cycle top to bottom.
shift-BACK: Usually goes to the beginning of the previous "major" section of the lesson (or back to the table of contents from the first section).
   Exam system: Go to the previous problem page.

HELP: Normal lessons provide a response to the HELP key whenever the user may have difficulty understanding the material or answering a question. Sometimes HELP invokes a new display and BACK is used for return.
   Exam system: HELP should explain restrictions on the answers. It should never invoke a new display.

shift-HELP: Often used to provide help on key conventions. At least that is its use in the exam system.

DATA, LAB, shift-DATA, shift-LAB: Used for many purposes, few of them mnemonically suggested by the names of the keys.
   Exam system (and other Computer Science lessons): shift-DATA transfers the user to the exam cover page. The others are used for various purposes by individual PG/G's; for example, LAB provides access to a calculator in one, and LAB/DATA move a pointer in another.

5.1. Instructions and Preparation

Even a crystal clear design is futile if students are not aware of its facets. These proverbs suggest ways to aid the student in understanding the system.

11. Provide practice exams. These will help students learn the conventions and gain confidence that the system retains answers and computes a grade only on the final answer. Quizzes at the ends of lessons should follow similar conventions so the student is at least accustomed to the fact that the system moves each answer after it is entered. (Unfortunately, many of our quizzes predate this finding and use other conventions.)

12. Provide written instructions. Prior to the practice and actual exams the student should receive a printed list of instructions for preparation and for reference while working problems. The instructions should emphasize the freedom of, and mechanisms for, moving around in the exam and changing answers.

13. Stress the mechanics of moving from problem to problem.
If students are aware of how to move between problems, they can always escape from trouble by going to another problem. The instructions for our experiment (Appendix C) stressed the possibility of moving, but failed to say how.

14. Stress the mechanics of entering and changing answers. This is less important than proverb 13 because students will enter more answers and will get more practice. Nonetheless, some stress is needed because the conventions differ from instructional lessons.

15. Don't patronize. Assume the student is intelligent even if not informed about the details of some concept or procedure. Do not present disparaging remarks in case of error. Avoid cute remarks. The subsystem through which students enter the exam system greeted the finished student with a gratuitous "Congratulations for completing csxmonitor!" This is especially inappropriate if the student "completed" the exam by leaving some problems undone and scoring poorly.

16. The proctor must take the exam and make mistakes. Only with experience will the proctor be able to help students when they get in trouble. At worst the experience should help the proctor know how to get around a previously undiscovered bug in a PG/G.

17. The proctor should avoid random roaming of the exam room. Students cannot profitably use the answers from other students, and roaming may arouse the sensitivity of those who prefer not to display their ignorance.

5.2. Problem Generation and Display

21. Use standard page layout (see Figure 5.2.1). This will provide a common, professional appearance, disguising the polyglot authorship of the PG/G's and providing continuity of expectation between the problems. All learning that carries over from one PG/G to the next saves the student intellectual effort.

22. Use a standard display order (see Figure 5.2.1). As with 21, this increases commonality.
In addition, it saves time because the student can read important information while more is being displayed, and because generation time will overlap with display time. (Central processor execution continues as soon as display commands have been transferred to a buffer for subsequent transmission to the terminal.)

23. Put multiple problems on a page. In this way a student can do more work with each overhead penalty. The display time for the problems might be the same, but there is display overhead in headers and key instructions and there is "context switch" overhead in recalling the meaning of a problem format. (However, care must be taken to avoid a cluttered appearance on the page.)

24. Specify clearly the set of possible answers. If the PG/G may reject answers and if the form of the answer is not obvious, some statement of the expected form is needed. Preferably, though, the question will be revised to the point where the expected form is obvious.

25. Use standard problem wording. While it is clear that the wording of a problem can seriously influence a student's ability to answer it (especially for foreign students), it is entirely unclear how best to word questions. This must be left for other research.

26. Do not use special character sets. The overhead to load such character sets is unacceptable in an exam. It is acceptable to load as many as five special characters for special applications. Unfortunately, there is not even a good way to disguise the load time, because characters should be loaded before beginning the display in order to avoid a disconcerting pause during the display process.

27. Control difficulty differences. It is essential that two students taking the same exam at the same difficulty level receive problems of nearly equal difficulty. To do otherwise is to raise serious questions of fairness.
In order to control difficulty, the designer must carefully list all the problem components which can affect understanding; these include expression complexity, statement order, and even such trivia as the length of identifiers.

[Figure 5.2.1 shows a sample problem page divided into regions labeled A through F. Text recoverable from the sample page: "Problem 3", "27 points", "What syntax errors are in this statement?", a sample FORTRAN DO statement, the legend "M - Missing, E - Extra, R - Replace", and the key reminders "shift-NEXT for next problem" and "shift-DATA for cover page".]

Figure 5.2.1. Standard Page Layout and Display Order. If problem generation is fast, the display should be in the order B, F, C, D, A, E so the generation will occur while B is being displayed. For slower problem generation the order should be B, A, E, D, F-C, where F-C implies piece-by-piece display. If there is initialization for F, it should start as early as possible.

5.3. Entering Answers

31. Follow the standard control key conventions. In this context, the most important conventions are those which specify that NEXT and BACK move the answer arrow up and down. These must be followed so the system behavior will always be predictable when the student is at the most vulnerable point, i.e., entering answers and trying to remember the answer long enough to get it completely entered.

32. For every student action provide a system reaction. Though it may be sufficient to respond to an answer entry by simply moving the arrow to the next position, it is probably better to also move the answer. This indicates clearly that the system has accepted the answer. All other key presses also deserve a response; however, it is PLATO convention to simply ignore illegal presses of the control keys, so this is a reasonable procedure to follow.

33. Use the standard scheme for changing answers. This follows from the previous proverb. The answer has already been moved to the answer slot and is readable while a new answer is being entered. It is important to note that if simply NEXT or BACK is pressed, the existing answer is unchanged.
To erase an answer the student must type a blank and then press NEXT (or some other control key).

34. Use the system "arrow" command for input. This command provides a number of features like erasure and editing which the more sophisticated students use advantageously. It is also faster than any other scheme of entering answers (since the only other scheme on PLATO is to accept characters one at a time and invoke the program for each character). A corollary is that the standard system "arrow" character should be displayed before the answer slot. (One of the experiment PG/G's used another arrow shape and this caused some very small confusion.)

35. Accept answers only when a control key is pressed. This proverb forbids the practice of accepting a short or one letter answer as soon as it is typed. Doing so causes two difficulties. First, the habit of not pressing NEXT is reinforced. Subjects sometimes forgot to press NEXT when it was required and thus wasted time. Second, and more critical, the student loses a chance to review the answer before making it final. Even though the answer can be changed, the student is likely to perceive a loss of control over the behavior of the system.

36. Always accept answers when a control key is pressed. On several occasions during the experiment and our personal use of the system, the correct answer was entered and then some control key other than NEXT was pressed. This resulted in an ignored answer and a loss of points. This problem can be avoided without great difficulty using facilities provided by the TUTOR language.

37. Do not allow the user to use special characters. Users should not be required to use special characters (using FONT or MICRO) in answers because special learning is needed to know how to enter those characters. The proverb is stated in this stronger version so that authors do not even consider the possibility of special characters (which are discouraged by proverb 26 anyway).

38. Accept answer lists.
In the DO loop problem and others, more than one answer is to be placed on a line. Sometimes a student will enter both answers at the arrow even though the arrow will move to the adjacent slot for the next answer and ignore the answer already supplied. To provide for such eager students, PG/G's must take the special effort required to capture all the information entered at every answer slot. For example, if more than one number is entered and there is more than one slot on the line, the answers should be spread across the slots. This proverb also requires the production of error messages when the student enters answers that will not fit into slots on that line. Extra characters should not be ignored as they can be with standard PLATO mechanisms for accepting numeric answers.

5.4. Errors and Help

41. Avoid rejecting responses. Many problems with response rejection appear in the subject logs (Appendix E). Responses are not rejected on paper and students are disturbed when a response is rejected.

42. Give prominent feedback on response rejection. If response rejection is unavoidable, the feedback must be large enough to see and should be placed on the screen close to where the student can be assumed to be looking. A message can be highlighted by flashing it as long as the flashing routine is not deaf to keys (i.e., any keypress should terminate flashing and proceed).

43. Give precise feedback on the reason for rejection. A simple "no" is not enough. The message should state what is wrong and even suggest how to fix it: "Do not use 'E', move the decimal point," "Specify a replacement character from (, -, *, ,, ., =".

44. When a response is rejected, do not go to a special state. Many subjects had trouble with the "Replace" option in the syntax problem; another lost points because of the irreversible nature of the DO loop problem.
After rejection the system should be in exactly the same situation as before the attempt, except that the rejected response should be visible and proverb 45 should be followed. The terminal should behave as a problem-independent, answer-slot-filling device.

45. Arm the HELP key when a response is rejected. If the student presses HELP immediately after response rejection, the assumption is strong that information concerning the rejection should be offered. Consecutive presses of HELP should elicit more detailed information, although this must be limited by the time available to implement the problem generator.

46. Use on-page HELP when possible. For both ordinary help and response rejection help, it is preferable to display the help information on the same page as the problem even if this requires erasure and regeneration of up to a quarter of the screen. With help supplied on the same page, the student will be able to refer to the parts of the problem while reading about them and will not suffer the context-switch delay when returning from a help page to the main problem.

47. Avoid designs that rely on the HELP key to assist the student. Such designs are certain to be too difficult; they need more work to reduce their complexity.

48. Since the HELP key is not to be used, use it copiously. Use HELP to explain the obvious, even items that are already explained on the page. Take care, however, against confusing extra words with extra explanation. Explanation is only valuable if the student can find it.

49. Do not assume the student has read and remembered the instructions. There are so many things to remember (the facts being tested, the particular instructions for the question, and the general procedure for using PLATO for the exam) that the student cannot initially remember them all. As a consequence, the system should not be harsh with students who have made some mistakes.

hk. The proctor must give explicit aid.
In early experimental sessions we attempted to help students only minimally, i.e., by pointing to the instruction that would solve their problem. This is unsatisfactory. When a student turns to a human for help, it is evidence of complete failure to find comfort in the system. The more quickly the human can explicitly help the student back onto the right track, the better the student's prospects for comfortable interaction with the system.

5.5. End of Problem

On paper, when a student finishes work on a problem, work can continue directly on the next problem. On PLATO, a few more considerations are needed.

51. Emphasize the last answer. Some of our PG/G's (for example, the arithmetic expression one) simply move the answer arrow back to the first answer slot after the last answer is entered. On one or two occasions we observed students befuddled by the implied suggestion that they try all the answers again. As an indication of the end of the problem, we suggest that before returning the arrow to the top the PG/G should cause the key convention message to flash. This message includes the information on how to move on to the next problem.

52. Avoid cover pages. Because of the time needed to display a page, PG/G's should avoid nonproductive displays such as cover pages. Most problems can be displayed entirely on one page, but some, like short-answer, are more convenient with one page per subproblem. The difficulty is that there must be some way to go to individual subproblems. Certainly access to subproblems must be possible via the simple shift-NEXT and shift-BACK used for transfer between problem generators. That may be enough if the number of subproblems is kept small. As an alternative, we have experimented with a vector of problem numbers displayed at the bottom of the page. The difficulty with this scheme was that the response at the answer arrow could be either an answer (distinguished as alphabetic) or a number specifying a new subproblem.
This added confusion to both answer entry and subproblem selection. It might be satisfactory, however, if the index vector were displayed only on command. An as yet untried alternative is to have a "mini" cover page with just the problem numbers and the student's answers (the latter as a clue to the problem contents). In this scheme the small amount of text could be displayed quickly.

53. Adopt a standard end-of-answers convention. On paper exams it is usually assumed that a student has written all the answers felt to be needed. For example, on a DO loop problem the answers given will be graded. Occasionally a syntax error problem will provide the option of stating that there are no errors. Such exceptions are more difficult on PLATO because the student is more uncertain of how to enter answers; the system constrains answers in so many ways. Of the experimental problem generators, the DO loop and syntax error problems both expected "non-answer" answers: "end" for the end and "N" for no errors, respectively. Ideally, problems with empty-list answers would be eliminated, but this is not always possible.

This proverb demands that there be a single "non-answer" convention for the system. It should be an "end-of-answers" code rather than an "empty-list" code, because students can use the former even at the end of non-empty lists. However, even if the code is omitted from a non-empty list, the grader must make the standard assumption that the student has entered all the answers the student felt necessary. A reasonable end-of-list code is "end".

One refinement might be valuable: when a student attempts to leave a problem without making any entry in a list, the system might flash a reminder of the empty-list convention. Unfortunately, this conflicts with predictable behavior, because the system convention is to transfer to the requested problem immediately when a transfer key is pressed.
One partial solution would be to flash the end-of-list message (which should already be visible in the problem statement) and then transfer to the requested problem after a very brief pause (say 2-3 seconds).

5.6. Grading

In one sense automated grading is absolutely fair, since the same algorithm is applied to every student. However, the system could be inequitable to the extent that it bases the grade either more or less on the ability of the student. The proverbs in this section are intended to suggest how grading can more accurately reflect student ability.

61. Give partial credit. The "precision" of a grading algorithm can be measured by the number of possible grades it can assign. An all-or-nothing grade reflects gradations of ability less accurately than a grade that may take any of several intermediate values. Even some of the graders used in the experiment gave partial credit: for a correct value with the wrong type or sign on the arithmetic expressions, for an incorrect print position, and for a correct second guess on the DO loop problem. More can be done, such as giving credit for proximity of value (especially in terms of character representation) and for a correct judgment of whether a syntax error is present. In most cases the amount of partial credit can and should be left to the judgment of the instructor when selecting options.

62. Use relative grading. In many problems an answer to some subpart depends on the previous answer. In manual grading of such problems it has been traditional to give credit for subsequent answers based on the previous value; thus, an incorrect value would itself be marked wrong but its effects would not propagate. This tradition can and must be continued with computer administered exams. Moreover, because of the problems of typing accuracy, subsequent answers that are correct in an absolute sense must also be accepted. This maxim applies in the obvious sense to noninteractive DO loop problems, but also applies to the print FORMAT problems.
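As an illustration of proverb 62 for a chain of dependent answers, here is a minimal Python sketch under stated assumptions. The function name `grade_chain`, the `step` callback, and the example loop (each value is the previous value plus 3) are all hypothetical; none of this is taken from the graders used in the experiment. An answer is marked correct if it follows from the student's own previous answer (relative) or if it equals the true value (absolute).

```python
from typing import Callable, List


def grade_chain(answers: List[int], start: int,
                step: Callable[[int], int]) -> List[bool]:
    """Mark each answer in a chain where every value depends on the one before.

    step(prev) gives the value that should follow prev. An answer earns credit
    if it matches the true sequence (absolute) or if it is what step() yields
    from the student's own previous answer (relative).
    """
    marks = []
    true_prev = start      # value the program actually holds
    student_prev = start   # value the student believes it holds
    for given in answers:
        absolute = step(true_prev)
        relative = step(student_prev)
        marks.append(given == absolute or given == relative)
        true_prev = absolute
        student_prev = given
    return marks


# Hypothetical loop: start at 2 and add 3 on each iteration (true values 5, 8, 11, 14).
# The student slips on the second value, carries the slip forward, then recovers.
print(grade_chain([5, 9, 12, 14], start=2, step=lambda v: v + 3))
```

In this run only the second answer is marked wrong: the third is accepted because it is consistent with the student's own 9, and the fourth because it is correct in the absolute sense.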
The print FORMAT grader judges the position of an answer relative to the position of the previous answer. It must also perform a pattern match to determine which strings in the answer match which of the problem data.

63. Do not use interactive grading. Our subjects had fairly strong reactions to rejection of answers by the interactively graded problem on DO loops. One student lost points simply by failing to note the error message and supplying the second iteration when the program called for a retry of the first answer. Many proverbs, including "Provide Reversibility," are violated by interactive grading. Moreover, the benefit expected from interactive grading (reduced error propagation) can equally well be achieved by relative grading. If interactive grading is to be used, it should be isolated and carefully explained to the student. It is possible that an entire exam of interactively graded questions would be acceptable, but this would presuppose the construction of a number of interactive questions, and the time is better spent devising relative grading algorithms.

Appendix A. The Exams Used in the Experiment

Two instances of the exam were used: form A and form B. When administered on paper, the questions shown here (p. 29-33) were presented; when administered on PLATO, slightly different versions were presented to each student.

The two exams each had four problems:

1. Evaluate five FORTRAN expressions (the system chose five from a collection of about 20).

2. Find syntax errors in three statements; each statement was presented in a separate display. (Form A had assignment, ASSIGN, and PRINT. Form B had GOTO, assignment, and DO.)

3. Show the output of a PRINT and FORMAT statement. (Form A used three F-format items; Form B had three I-format items.)

4. Show the output of a PRINT in a DO loop. (This problem generator/grader attempts to provide interactive grading so an error on one value does not mean loss of points on subsequent values.
In addition, the generator creates many different variants, possibly involving subscript-out-of-bounds and modification of values used by later steps.)

The practice exam consisted of two expressions from 1 above and one syntax error from 2.

The following pages illustrate the question formats used in the exam; they are taken from the paper version of Form A. The actual paper exams differed as follows:

Form A: As shown, except that problem 2 (FORTRAN syntax) had two more pages with these FORTRAN statements:

   12 ASSIGN. 25. TO. WUZXIYHJ
   1 PRINT. ,X,XQUX

(The dots indicate spaces; they are one-fourth the size of a period.) The ASSIGN was a great trial to the students because (a) they had not previously encountered an ASSIGN statement, and (b) "extra HJ" was not an answer that would have been accepted by the problem generator/grader. (Generation of the 8-letter identifier was a bug.)

Form B:

Problem 1. Using the same variable values, the expressions were

   N+IFIX(S)/I, B**L**3-0, (S-B)*CA, A/C/B*3, -B**C

Problem 2. The three statements were

   68 GO. .TO. 778
   3 AL=MKFT
   111 DO.7.JFM=9*31

Problem 3.

   I=68
   J=8
   K=2
   PRINT 20,I,J,K
   20 FORMAT('l , ,2ll+,I5,I5)

Problem 4.

   INTEGER M(6)
   READ,M
   DO 20 I=1,8,2
   IF(I.EQ.6)GO TO 20
   M(I)=M(I)-M(3)
   10 PRINT,I,M(I)
   20 CONTINUE

   Values for M: 4, 3, 6, 4, 4, 6

This is harder than form A because M(3) is modified before the last iteration.

[Sample Problem Displays, pp. 29-33: reproductions of the paper version of Form A; not recoverable from the scan.]

Quirks of the Problem Generator/Graders

There is an art to the creation of a successful Problem-Generator/Grader. The four used in this experiment illustrate many of the possible difficulties.

P1. FORTRAN Expressions (author: L. R. Whitlock)

This PG/G was designed from the start as a makeshift. It provides only a standard set of 12 variable definitions and a small library of expressions. The only "generation" is to select expressions from the library at random. Since the experiment, a more sophisticated generator has been implemented.

Only two features of this PG/G annoyed subjects. First, it rejects answers using "E" for exponent, insisting, for example, that 1.5E2 be written 150. This is reasonable (especially since no variable value has an "E"), but it raises the question as to what is being tested. Second, the arrow moves from the last answer slot back to the first, as discussed in proverb 51.

P2. FORTRAN Syntax (author: F. J. Izquierdo)

This problem-generator/grader was the result of the Master's project described in Izquierdo, F.
J., "A Generator/Grader of Problems about Syntax of Programming Languages to be Used in an Automated Exam System," UIUCDCS-R-75-755, Department of Computer Science, University of Illinois, Urbana, Illinois (1975).

A very important contribution of this effort was to show how to do a reasonable job of generating syntax errors without generating ambiguously correctable constructs. The approach is to generate errors by first generating a correct statement and then modifying symbols chosen from a certain set. By careful choice of this set the system avoids most possibilities for ambiguity.

The generator was completed before the exam system and independently of it, so any inconsistencies between the two are not mistakes but are simply due to different choices. This discussion will nevertheless point out the differences, because they illustrate the kinds of differences that can annoy the user. Such differences contribute to nonpredictability and thus violate proverb 01.

a. The most serious difficulty with the syntax problem was a design error: when an answer was rejected, a mode was entered wherein only a valid answer could be supplied. This occurred after entering "E" and "R" as the error type and then trying to specify some character not in the set of modifiable symbols. Almost every subject had trouble with this feature and had to be helped out verbally. (The problem was compounded by the fact that the messages describing key conventions were visible even though those keys were inactive.) See proverb 31.

b. A second area where subjects (at least all the videotaped ones) had difficulty was in trying to move from one subproblem to the next and in trying to leave the syntax problem. Within the PG/G, shift-NEXT always went to the next subproblem, but after the last subproblem it returned to a syntax problem cover page. On the latter, shift-NEXT returned the student to the first syntax subproblem.
This has been fixed by arranging the conventions so that the syntax problem appears to be consecutive pages, with the last subproblem preceding the next problem page on the exam. See proverb 52. (E.g., the pages in the experiment would be exam-cover, P1, syntax-cover, Q1, Q2, Q3, P3, and P4; shift-NEXT and shift-BACK move forward and back in this list.)

c. Unfortunately, the difficult goal of providing good problems was only partially met. Many examples of incorrect problems have been noted:

   the eight-letter identifier on form A,
   generation of a statement longer than the space available,
   generation of a not-yet-introduced construct (CALL) as the object of a logical IF,
   a legal statement resulting from an intended error (DO 10 I=9*31),
   implied semantic errors: DO 25 KD = 93 13 (the bounds would have to be in ascending order: 9, 313),
   statements with ambiguous corrections: DO 10 I = 6*, 1 or INTEGER I(5/ (initialization or dimension).

d. One way to solve the problems of (a) and (c) would be to revise the solution technique: the student would enter a new version of the line, and it would be graded on its proximity to the original and its lack of syntax errors.

e. A problem less easy to solve is the generation of problems of very diverse difficulty. Some had no errors and some had only very trivial errors. Exhaustive study would be required to determine a hierarchy of difficulty suitable for doing better than this generator. Perhaps in the meantime this generator is suitable as a drill device rather than an exam device.

f. This was the only generator to delay subjects by loading a special character set. It did so because it uses the conventions (and some of the internal mechanisms) of the ACSES compiler system, which uses the same special character set.

g. In examining the logs, it can be noted that three activities were recorded for displaying a syntax problem.
These were Present, Generate, and Present; the generator already follows the practice of displaying the basic problem statement before going ahead to generate the problem (though not before loading the character set).

h. Like the compiler system, the syntax problem displays blanks in statements as a small dot (one plasma dot, where a period is four dots). This led two of the videotaped subjects to ask whether the dots were the errors. Conversely, another subject inquired whether the error might be that the characters were in the wrong columns, so some statement that the columns are correct is necessary.

i. There were a few trivial details in which the syntax problem differed from the rest of the system:

   its cover page does not display the problem number of the syntax problem relative to the exam,
   a problem not yet attempted is marked with an asterisk on the syntax problem, while the same mark is used to indicate an attempted problem on the exam cover page,
   the problem uses a different graphic for the answer entry arrow.

P3. PRINT FORMAT Problem (author: L. R. Whitlock)

Few students attempted this problem because the material had not been covered in lecture by the time of the experiment.

a. The generated FORMAT always includes a '1' as the first item, but there is no way to indicate that a line is to appear at the top of the page. Either the '1' should be deleted or one of the output lines should be marked as being the top of the page.

b. There is no control over how many test values will require rounding up before they are printed. In consequence, there are considerable differences in level of difficulty; for example, form A required rounding up three times, while some PLATO versions of the problem did not involve rounding at all. (Indeed, at least two of the seven subjects lost points solely on failure to round up.)

c. This PG/G was one of the first to feature relative grading and partial credit.
If the answer given does not exactly match the correct answer, the PG/G looks for the correct value in a wrong position and subtracts only a small number of points if a match occurs. If the answer still does not match, the fractional part is dropped and a match on the integer part is sought; again, a success leads to only a partial deduction. With this algorithm, a student will usually not lose full points for failure to round up (unless the rounding carries into the integer part, as happened in one example).

d. The design of the answer entry scheme for a print FORMAT problem is difficult. Originally this PG/G displayed an extra line (with the tic marks above and below it) for entering a line of the output grid, together with a question as to which grid line the user wished to fill. After specifying a line number, the user entered text in the extra line and pressed NEXT; then the system transferred the contents of the extra line to the specified position in the grid. This scheme caused difficulty for at least SD and SG. It has since been revised so that the user is asked which line to fill before the extra line is displayed. In this way, attention is first directed to the grid; then, when a line is selected, the means to fill it is provided. Yet another alternative would be to put in the text first and then specify where to put it, but this leaves no destination for the text if the user exits other than by pressing NEXT. Ideally, perhaps, there would be a standard text editor associated with the exam system (and the comments system and the quiz system) so the user could move a cursor anywhere in the answer area and enter answers.

P4. DO Loop and Array (author: B. Speelpenning)

There were two goals in the design of this PG/G: first, to use interactive grading, and second, to attempt to get maximum variability from a small generator. Both were met. On a few occasions students did get more points because the grading algorithm stopped them from propagating an error.
On too many other occasions, however, students lost points from errors they would have spontaneously corrected and lost confidence through rejection of an answer. Maximum variability was gained, but at the expense of creating problems of greatly disparate difficulty.

a. Examples of disparate difficulty include that the predicate in the IF was never true in some versions, that the termination was sometimes by subscript-out-of-bounds, that the increment for M(I) was sometimes a constant and sometimes an array value, and that the array value increment was sometimes modified during a pass through the loop. The two versions used on the paper parts of the experiment illustrate the possible range of difficulty.

b. The answer entry mechanism required a specification that the last output had been generated (see proverb 53).

c. "del" was provided to delete an answer, but it applied only to the first of the two values on a line. The subjects did not understand how to use this feature.

d. The interactive grading was a more severe problem on another exam, when the interactive problem was not given as the last problem. Students tended to fret more about every answer on every problem after doing one problem interactively.

e. Two more problems with interactive grading were that people did correct themselves on the paper exam in ways that would have lost points on the interactive system, and that one subject entered the correct response to the second line while the system was waiting for a retry on the first line.

f. At least one subject lost points by pressing some key other than NEXT after entering an answer.

Appendix B. Questionnaires

Early in the experiment session subjects filled out the "Background Sheet" (p. 42). Their answers appear in Figure 4.1. After the exams subjects filled out the "Questionnaire on PLATO Exam and on Paper Exam" (usually referred to as the post-test questionnaire) (pp.
43-46). The "satisfaction" score in Figure 4.2 was derived from questions 1, 4, 5, 9, and 11-15 of the post-test. The weights were:

   question    response
               a   b   c   d   e
      1        5   4   3   2   1
      4        5   4   3   2   1
      5        5   3   1
      9        1   3   5
     11        5   1   3   4   5
     12        1   2   4   5
     13        1   2   3   4   5
     14        5   1   3   3
     15        yes = 2, no = 4

The average satisfaction was 3.11 with a standard deviation of 0.74. Subject responses to post-test questions 1-15 appear in Figure 4.2. Their written comments in reply to question 16 are transcribed below, following the questionnaire.

Background sheet

For the evaluation of this experiment we need some biographical data about you. We appreciate it very much if you answer the following questions completely. All your answers will be kept confidential and used only for the evaluation of this experiment without mentioning your name.

1. What year were you born?
2. What is your approximate college GPA?
3. What year are you? (Freshman, Soph., Jr., Sr., Grad.)
4. Which department are you in?
5. How well are you doing in CS101? Poor / fair / good / very good
6. What grade do you expect in CS101? (A, B, C, D, or E)
7. Approximately how many hours have you spent on PLATO?
8. How well could you use a regular typewriter before using the PLATO keyboard? (Check one, please.)
   a. never tried
   b. with great difficulty (spent a lot of time looking for keys)
   c. slow but sure (hunt-and-peck with only rare delays)
   d. good to expert (typed regularly)

Questionnaire on PLATO exam and on paper exam

1. How did you like the PLATO exam compared to the written exam?
   a. liked the PLATO exam much more than the written exam
   b. liked the PLATO exam a little more than the written exam
   c. liked the PLATO exam about the same as the written exam
   d. liked the PLATO exam a little less than the written exam
   e. liked the PLATO exam much less than the written exam

2. What do you think about the environment during the exams?
   a. the experimental situation bothered me very much
   b. the experimental situation bothered me sometimes
   c. the experimental situation didn't bother me at all

3. What do you think of the content of the PLATO exam in general?
   a. material tested was too difficult
   b. material tested was challenging
   c. material tested was of right difficulty
   d. material tested was easy
   e. material tested was too trivial

4. What do you think of the instructions and procedures for getting around and answering questions in the PLATO exam?
   a. very easy to follow
   b. easy to follow
   c. clear, but not obvious
   d. difficult to follow
   e. very confusing

5. Do you think you were able to concentrate on the PLATO exam?
   a. I was able to concentrate
   b. PLATO disturbed my concentration sometimes
   c. I was not able to concentrate on the problems

6. What do you think of the content of the paper exam in general?
   a. material tested was too difficult
   b. material tested was challenging
   c. material tested was of right difficulty
   d. material tested was easy
   e. material tested was too trivial

7. Do you think you were able to concentrate on the pencil and paper exam?
   a. I was able to concentrate
   b. my concentration was disturbed sometimes
   c. I was not able to concentrate on the problems

8. How frequently did you switch to another problem on the PLATO exam?
   a. very often
   b. sometimes
   c. only once to each problem

9. How much difficulty did you encounter when switching from one problem to another on the PLATO exam?
   a. a lot of difficulty
   b. some difficulty
   c. very little difficulty

10. How frequently did you switch to another problem on the paper exam?
   a. very often
   b. sometimes
   c. only once to each problem

11. Do you feel that the keyboard was an annoying hindrance in communicating with PLATO?
   a. it was no hindrance for me
   b. it hindered me somewhat
   c. it was not more of a hindrance than a pencil
   d. it was easier to use than a pencil
   e. I like using the keyboard much more than using a pencil

12. Do you feel that your typing ability had an influence on your performance on the PLATO exam?
   a. it had a great influence
   b. it had some influence
   c. it had almost no influence
   d. it had no influence at all

13.
Do you feel that reading the problems from the screen took more time than reading them from paper?
   a. it took much more time
   b. it took some more time
   c. I didn't feel a difference
   d. it took less time
   e. it took much less time

14. What kind of an exam would you prefer?
   a. exam on PLATO
   b. paper and pencil exam
   c. part of exam on PLATO and part on paper
   d. I don't care

15. Was there anything on the PLATO exam that caused you more delay than you wanted? yes / no

16. Please, write down anything that bothered you on the experiment or anything that bothered you on the exams or anything that caused you too much delay or any suggestions for further improvement of the exams.

We thank you very much for your help.

Responses to request for comments:

16. Please, write down anything that bothered you on the experiment or anything that bothered you on the exams or anything that caused you too much delay or any suggestions for further improvement of the exams.

SA: I feel that the directions were very confusing, especially with going back and changing answers and leaving a problem to go on to the next one without answering the previous problem. If these exams were ever used I feel that better instructions on how to use the machine would help alot.

SB: In the syntactical error part it was difficult to erase a wrong answer on the PLATO test.

SC: The examples used both in the Plato exercise and the written exam were relatively easy except for one section on Format which we haven't covered very much in class. Being conscious of being watched and recorded did make me apprehensive. I think this bothered me a little more on Plato because I'm not used to taking tests on it. I was less apprehensive with the paper and pencil test, I guess because I'm used to being watched while taking paper and pencil exams.
Another aspect is with Plato you are "broadcasting" your answer to the world where in a paper and pencil situation if you think an instructor is trying to evaluate your answer while you write and this bothers you, you can cover it up easier.

SD: All of the instructions weren't clear. The atmosphere had some effect on concentration. I like the way in which you can return to problems. I think now that I have experienced the exam on Plato, I would do better on following ones. Being a totally different experience, it affected my output a little.

SE: Some instructions on Plato weren't very clear. I found an error in correction of syntax DO statement. Hadn't covered format material as of yet. Generally liked Plato exam though.

SF: I didn't mind the Plato exam. In fact, I think taking an exam on Plato could be quite enjoyable, but first, before I decide to take a real exam on Plato, I would have to gain a little more experience on Plato itself: its terminology, procedures, etc. This experiment seemed be, though, a bit "unfair" to Plato. We took an unfamiliar exam 1st on Plato and then on paper (after I knew what was going on). Maybe the exam should be "split" to where 1/2 of the exam is taken on Plato 1st and the other half on paper 1st.

SG: Syntax question difficult to work through due to problems with replace mode.

Appendix C. Experiment Administration

For each experimental session an "Activities" outline (p. 50) was drawn up prior to the experiment; the outline for the first session is included below. Subsequent sessions differed in the order in which the exams were given and in that only one tape was used.

The "User's Guide..." (p. 51) was given out because earlier experience had shown its necessity. In retrospect, the instructions should contain more details on how to perform the activities that are said to be possible.

The "Time Sheet" and "Log Sheet" (p. 52) were separate documents, but both served purely housekeeping functions.
[Activities During Experiment Session, p. 50: handwritten outline; not recoverable from the scan.]

User's guide for PLATO exams

These are a few hints about how you can proceed in the PLATO exam.

1) For all sections of the exam (other than the DO-loop problem), please remember:
   - You can go back and forth to look at any problem or at any answer you gave.
   - You can work at any problem at any time and as often as you wish.
   - You can replace any answer you gave at any time during the whole exam.
   - Your answers will not be judged or graded at once when you type them in. (Your answers will be graded only after you finish the whole exam.)

2) In the DO-loop problem only, your answer will be judged immediately after you type it in. A wrong answer can be corrected only once and you will get partial credit for that answer.

Time sheet

   Name of student
   Begin of experiment
   Begin of practice exam
   End of practice exam
   Duration of practice exam
   Begin of PLATO exam
   End of PLATO exam
   Duration of PLATO exam
   Score
   Begin of paper exam
   End of paper exam
   Duration of paper exam
   Score
   End of experiment

Log sheet

   Session
   Student
   Content
   Activities of the student:
   Time

Appendix D. Definitions of Categorization of Activities

The following categories of activities were used to classify the subject behavior observed on the video tapes. Titles include brief name, code in parentheses, and long name.

Think (RT) Reading and Thinking: While the student reads the problem and thinks about the answers.

Calculate (CP) Calculating with Paper and Pencil: Whenever the student makes calculations on paper.
Answer (EA) Entering Answers: Includes time the student spent looking for the desired keys; deciding which key should be pressed next; waiting while the answer is processed by the PG/G; and redoing answers the PG/G rejected (though some of this was recoded to Trouble).

Select (PS) Problem Selection: The period that begins with the keypress to leave a question and ends with the keypress to go to the next question. This period includes travel time to the cover page; time to display the cover page or instructions on how to get to the next question (e.g., in the syntax PG/G, after the student pressed the key to indicate he was through working on the present question, a list of keys was displayed which permitted him to go to the next syntax question, the previous syntax question, or the syntax problem cover page); time to read the choices and make a selection; and time to enter the selection.

Most of the categories are well-defined, but it is appropriate to discuss the evidence we used to differentiate among Think, Answer, and Trouble. Usually during Think time the subject was motionless or fiddled with some object but stared at the exam or paper. Answer time was almost always time when the user bent the upper part of the body toward the screen or paper and moved the hand toward the answer mechanism. A perceptible relaxation of postural attitude usually marked the transition back to Think. Trouble time was identified by turning of the head toward the proctor and the asking of questions; it was also evident when the subject had completed an answer and was searching the screen and the surroundings for clues as to how to proceed. Knowledge of the various kinds of difficulties the subjects experienced, as recorded in the manual logs, also helped to identify Trouble time.

Appendix E. Detailed Logs of Subject Behavior

This appendix includes details of the behavior of each subject, although with far less detail for the unvideotaped subjects.
For each taped subject there is a log, annotations for the log, and an "Analysis of Activities." Only the log annotations are available for the untaped subjects.

In deriving the logs presented here from the second-by-second analyses of the tapes, it was observed that occasionally Trouble time had been recorded as Answer entry time. To present a more appropriate picture of the amount of trouble the subjects had, a recoding rule was adopted: if an Answer time was inexplicably longer than ten seconds, it was reduced to six seconds and the rest of the time was counted as Trouble time (or sometimes Read and Think time). We further decided to count such converted intervals as separate activities, which tended to slightly reduce the average size of some activity categories.

Each line of a detailed log corresponds to work on one problem. The time shown is the first second of starting to select the problem. (For confidentiality, all times have been shown as between eight and ten.) The problem numbers have been encoded with capital letters for PLATO and lower case for paper according to this code:

   PLATO   paper   problem
   P1      p1      arithmetic expressions
   P2      p2      the entire syntax problem
   Q1      q1      the first syntax subproblem
   Q2      q2      the second syntax subproblem
   Q3      q3      the third syntax subproblem
   P3      p3      the PRINT/FORMAT problem
   P4      p4      the interactive DO loop problem
   quit            making the decision to terminate

The eight categories of activities fall naturally into the five basic categories shown in each log:

Select: There are three types: selection from the exam cover page, selection from the syntax problem cover page, and selection of the next syntax subproblem without going to the syntax cover page.

Display: Includes problem presentation, problem generation, and loading character sets. The latter two were small: Load was within a second of :19, Generation of P2 was about :29; Generation for P4 was only about :01.

Think & Calculate: Includes all time spent reading, thinking, and calculating.
Answer: Answer Entry time.

Trouble: Was encoded in the second-by-second logs as WN, "What next?".

Each entry in the log has three fields: time in minutes and seconds, number of activities, and notes. Only the first is usually present. The number of activities is preceded by a slash; if omitted, there was only one. Each note is preceded by an asterisk; the notes have these interpretations:

*1 The time shown includes some problem Generation time.
*2 The time shown includes some character set Load time.
*3 The time shown does not include character set Load time because there was no load. Normally, at this point there would have been a Load, but the character set had been loaded during the practice exam.
*4 The time shown includes some time converted from another category as described above. This is usually mentioned in the annotations.

Detailed Log for Subject SA

PLATO exam (each column is listed in row order; empty cells are not marked)

   Wall Time:          50:03  52:55  53:57  58:06  58:33  59:53  05:55  07:17  08:08  12:16  12:30  13:07
   Pr #:               P1  Q1  Q2  Q3  P3  P4  P1  Q1  Q2  Q3  P4  quit
   Select:             05  22/2  01  02  32/3  09  15  16/2  02  02  20/2  07
   Display:            02  18/3*1*3  20/3*1  18/3*1  08  08  04  28/2*2  07  05  12
   Think & Calculate:  2:05/7*4  :18  1:16/5  :06  :27  5:34/11  1:03  :07  :59/4  :07  :05
   Answer:             24/5  04  14/3  01  :11/10  :19/4
   Trouble:            2:18/6*4  :13  2:41/5*4

paper exam (each column is listed in row order; empty cells are not marked)

   Wall Time:          15:31  17:29  17:50  18:10  19:36  19:48  21:31  21:51  22:27
   Pr #:               p1  q1  q2  q3  p3  p4  q2  q3  p1
   Select:             01  01  01  03  01  02  06  01  07
   Think & Calculate:  50/5  18/2  19  16/6  11  29/7  13  30/2  16
   Answer, Trouble:    :07/5  :02  07/3  12/6  01  05

Log Annotations for SA

practice exam

On the arithmetic expression problem, needed verbal explanation of why the arrow returned to the first question after entry to the last one. On the syntax problem, had trouble getting back to exam cover page. Spent 1:18 selecting problems and waiting for displays, 1:55 on productive work, and 2:17 on difficulties understanding instructions.

PLATO exam

8:50:03 P1 The 2:05 thinking time was composed mostly of calculation with paper and pencil (1:50).
There was :07 pure reading and thinking time and :08 recoded from entering-answer time. Illustrating a problem with the PLATO terminal, SA entered an underscore for a minus sign; this was quickly corrected, though. (Underscore is a normal upper-case key; minus sign is on a black key at the left of the keyboard, together with plus, times, and divide.)

:53:57 Q2 Subject tried to give answer "E ASSIGN", meaning the word ASSIGN should be removed (it shouldn't, but subject didn't know the ASSIGN statement). PG/G would not accept that response and would not permit any action except to enter an acceptable "extra" character. Finally, we told the subject how to enter an acceptable character. Because BACK1 hadn't worked while the PG/G awaited the "extra" character, subject was then confused as to how to leave the PG/G. (The trouble time shown includes :39 recoded from entering-answer time.)

:59:53 P4 SA had considerable difficulty with the semantics of DO loops and the mechanics of interactive grading. After a :43 think, SA entered the wrong answer for the first line, but failed to notice that this answer was rejected by the PG/G. After another :51 think, subject entered a response for second line, but PG/G assumed it was the retry for the first line and gave the correct answer to the first line. After thinking for another 1:07, the subject correctly entered the answer for the second line. After only :17, SA made an entry for the third line, but it was wrong because subject failed to observe that the array had been modified by the previous step. Noticing the rejection and taking great care, SA re-entered an answer for line three after 1:23, but still got it wrong. (Answer expected was "sob" for subscript-out-of-bounds.) The two longest think times were recorded when subject was told previous response was wrong.
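Several annotations in these logs cite time recoded from Answer entry to Trouble. As an illustration only, the recoding rule stated at the beginning of this appendix (an unexplained Answer interval longer than ten seconds is cut to six seconds, with the remainder counted as Trouble and as a separate activity) can be sketched in Python. The names Interval and recode are hypothetical, and the sketch assumes that every over-long Answer interval passed to it has already been judged inexplicable by the analyst; it is not taken from the original analysis.

```python
from dataclasses import dataclass

ANSWER_LIMIT = 10  # seconds; an unexplained Answer interval longer than this is recoded
ANSWER_KEPT = 6    # seconds that remain classified as Answer after recoding


@dataclass
class Interval:
    """One observed activity (hypothetical structure, for illustration only)."""
    category: str  # "Answer", "Trouble", "Think", ...
    seconds: int


def recode(intervals, excess_category="Trouble"):
    """Return a new list in which each over-long Answer interval is split in two.

    The first part keeps ANSWER_KEPT seconds as Answer; the remainder becomes a
    separate interval in excess_category ("Trouble" by default, or "Think").
    """
    recoded = []
    for item in intervals:
        if item.category == "Answer" and item.seconds > ANSWER_LIMIT:
            recoded.append(Interval("Answer", ANSWER_KEPT))
            recoded.append(Interval(excess_category, item.seconds - ANSWER_KEPT))
        else:
            recoded.append(item)
    return recoded
```

Under these assumptions a 45-second Answer interval becomes a six-second Answer interval followed by a 39-second Trouble interval. Total time is unchanged, but the number of activities rises by one, which is why the rule slightly lowers the average length of an activity.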
9:08:08 Q2 Subject attempted the "R"eplace option and had great difficulty, again because the PG/G would only accept certain characters as the characters to be deleted and inserted. Subject again had trouble getting out of the problem. (1:10 recoded from enter-Answer to Trouble.)

paper exam

9:15:31 p1 SA calculated all answers on paper. All the think time shown is calculate-with-pencil time. Only error was to get 27.0 as value of A/C/B*3.

:18:13 q3 The log record shows :08 calculate-with-pencil time, but no actual writing appears.

:19:36 p3 Did no work on this problem.

:19:48 p4 Each answer was written as it was computed. Subject got all answers except the value for M(5), which depended on the fact that the previous step had modified the value of M(3).

:21:31 q2 Answered correctly; had given no answer the first time through.

:21:51 q3 Changed answer to "M (9*31)".

:22:27 p1 The Select time was large because the subject flipped through many pages.

Analysis of Activities for SA (table not recoverable from the scanned copy)

Detailed Log for Subject SB

PLATO exam (entries after Display are listed in column order; empty cells are not marked)

   Wall Time  Pr #  Select   Display    Think & Calculate, Answer, Trouble
   8:45:01    P1    :09      :02        4:40/9  :26/6
   :50:18     Q1    :20/2    :10/2*1*3  :07/2*4  :06
   :51:01     Q2    :01      :21/2*1    :47  :01
   :52:11     Q3    :05      :14/2*1    1:13/7*4  :22/5  1:24/3*4
   :55:29     Q1    :11      :08        :06  :34/2*4
   :56:28     P3    :20/3*4  :04        1:26
   :58:18     P4    :12      :07        3:28/10  :48/9  :42/3*4
   9:03:35    P3    :15      :08        :40
   :04:38     quit  :09

paper exam (entries after Select are listed in column order; empty cells are not marked)

   Wall Time  Pr #  Select  Think & Calculate, Answer, Trouble
   8:27:49    p1    :01     3:51/7  :10/6
   :31:51     q1    :01     1:27/2  :01
   :33:20     q2    :01     2:50/6  :18/6  :08*4
   :36:37     q3    :01     :37  :01
   :37:16     p3    :02     2:38/5  :54/5
   :40:50     p4    :01     1:39/6  :05/4
   :42:35     quit  :01

Log Annotations for SB

practice exam

8:12:37 Spent 1:38/4 selecting problems, :32/4 waiting for display, 2:28/8 in reading and answering, and an enormous 5:46/3 trying to figure out what to do. 4:05 of the latter was because proctors did not yet understand that the syntax PG/G required an acceptable character after E. Subject had entered "E ." to indicate that the dots used to represent blanks were extra (i.e., subject did not understand the dot convention because had not used the compiler). Another 1:26 of Trouble time was spent trying to figure out how to get out of the syntax problem.

paper exam

8:27:49 p1 SB was fairly slow, spending times of :25, 1:16, :18, :16, and 1:46; the last included :23 of calculation with pencil.
In addition to pencil calculations, the subject used fingers to point to the problem and the variable definitions. The two long times included hard divisions: 18÷4.5 and 15÷4, respectively. Subject's only errors were to omit the sign from the third and fifth answers.

:31:31 q1 There was no error and the text was a simple assignment. Subject may have spent 1:28 thinking about it either because of worry that there should be an error or because unsure how to answer.

:33:20 q2 Problem was "ASSIGN 25 TO WUXYZIJH". Subject was very unsure, made several changes, and finally entered "E JH", which is correct but would have been rejected by the PG/G (in hand grading this was given one point). Subject also erroneously entered "R ASSIGN TO", so clearly understood neither the ASSIGN statement nor the R option on the PG/G. (:08 was recoded from entering-answer time.)

post-test questionnaire

Responding to question 11 as to whether the keyboard was an "annoying hindrance," the subject circled a response code for "somewhat." We asked verbally what the problem was and the subject mentioned only the trouble with the syntax problem; i.e., the "hindrance" was in the problem design rather than the keyboard. Verbal questioning about typing ability brought out the fact that the subject had had a typing class three years previous. In other verbal comments the subject mentioned having learned F FORMATS from reading, but not having learned I FORMATS (which explains the better performance on p3 even though it was harder than P3).

Analysis of Activities for SB (table not recoverable from the scanned copy)

Detailed Log for Subject SC

PLATO exam (entries after Display are listed in column order; empty cells are not marked)

   Wall Time  Pr #  Select  Display    Think & Calculate, Answer, Trouble
   8:44:01    P1    :02     :03        4:19/14  :12/6  :05
   :48:42     Q1    :22/2   :12/3*1*3  :42  :01
   :49:59     Q2    :03     :17/3*1    1:10/3  :05/3  1:25
   :52:59     Q3    :02     :16/3*1    1:30/3  :03/2  :11
   :55:01     P3    :28/3   :07        1:35  :03
   :57:14     P4    :05     :10/2*1    5:16/11  :15/9  :52/2
   9:03:52    P1    :17     :04        1:27
   :05:40     Q1    :14/2   :24/2*2    :39
   :06:57     Q2    :08     :06        :23
   :07:34     Q3    :02     :08        :29
   :08:13     Q1    :05     :08
   :08:26     P4    :29/2   :10        :05
   :09:10     P3    :04     :13        1:08
   :10:35     quit  :10

paper exam (entries after Select are listed in column order; empty cells are not marked)

   Wall Time  Pr #  Select  Think & Calculate, Answer, Trouble
   9:12:40    p1    :01     3:23/11  :08/5
   :16:12     q1    :01     :44  :02
   :16:59     q2    :02     1:28/6  :26/4
   :18:55     q3    :02     :53/3  :03/2
   :19:53     p3    :02     :06
   :20:01     p4    :02     1:43/5  :08/4
   :21:54     p3    :01     :02
   :21:57

Analysis of Activities for SC (table not recoverable from the scanned copy)

Detailed Log for Subject SD

PLATO exam (each column is listed in row order; empty cells are not marked)

   Wall Time:          36:45  39:32  40:29  41:21  41:57  42:20  45:47  48:29  50:32  51:34  51:53  52:55  54:00  54:28
   Pr #:               P1  Q1  Q2  Q3  Q1  P3  P4  P1  Q1  Q2  Q3  P3  P4  quit
   Select:             13  22/2  03  08  01  24/3  11  13  16/2  05/2  07/2  22/3  08  09
   Display:            09  19/3*1*3  13/3*1  13/3*1  08  08  10/2*1  02  24/2*2  09  08  08  09
   Think & Calculate:  01/8  15  35  14  50/10  45/6  43/5  22  05  43/4  35  11
   Answer:             24/6  01  01  01  37/8  14/5  05/3  04/3
   Trouble:            14  28  22

paper exam (entries after Select are listed in column order; empty cells are not marked)

   Wall Time  Pr #  Select  Think & Calculate, Answer, Trouble
   8:24:17    p1    :02     3:09/2  :09  :04
   :27:41     q1    :02     :25/2  :02  :12
   :28:22     q2    :03     :25  :02
   :28:52     q3    :04     1:00/4  :18/4  :28
   :30:42     p3    :04     1:05/4  :21/3
   :32:12     p4    :03     2:51/9  :27/9
   :35:33     quit  :05

Log Annotations for SD

practice exam

8:18 Q2 Had considerable trouble trying to return from syntax problem to exam cover page (shift-NEXT would return to the same problem because there was only one). Spent 1:47 in productive work and 2:03 trying to figure out what to do next.

paper exam

8:24:17 p1 Spent 2:46 calculating all answers with pencil (not on exam paper though), then :04 inquiring how to enter answer and :09 entering all answers at once. Made three mistakes, all by wrong sign; two were wrong sign on final answer and one was wrong sign for intermediate result.

:27:41 q1 The "trouble" time here was a question as to whether the syntax errors might simply be that the text was in the wrong columns (they were not). In fact, q1 and q2 had no syntax errors.

:29:50 q3 The :28 trouble time was uncertainty as to how to fill out a "replace" error response. Eventually did correctly. Next subject entered "Missing limit," but correctly erased that.

:32:12 p4 Of the nine answer entries, seven took one second and two took ten. SD started to respond with values of 1, 2, 3,..., but then spontaneously noticed the increment field.
Had SD been interactively graded, the error message for the second row might have caused more worry and concern than enlightenment. Got the third line wrong because an array value it used was changed by calculation for second line. Failed to specify "sob".

PLATO exam

8:36:45 P1 Did :32 calculation with paper for first expression, but did the rest in head. (May have worked on more than one problem while calculating on paper.)

:41:57 Q1 The second visit to Q1 was unwanted; a press of NEXT from the syntax cover page takes one to it. This could be recorded as an additional :09 of Trouble time.

:44:02 P3 SD required verbal explanation of how to enter lines by first entering the line number where they were to go.

:47:06 P4 Pressed shift-BACK when should have pressed NEXT to terminate the answer and have it accepted. Subject recovered, entered the answer correctly, and got full credit.

:49:02 P1 Again SD calculated answers with pencil (:23); also changed some answers.

:52:08 Q3 Changed answer. No change to P3 or P4 during second visit.

Analysis of Activities for SD (table not recoverable from the scanned copy)

Annotated Log for Subject SE

paper exam (8:27-8:41)

p1 Although subject did paper calculation on second subproblem, the answer given was inexplicably IT -7 instead of the correct 1.0. The answer -3.0 for the third subproblem might possibly be by confusion between the letter I and the digit one.
For the fourth subproblem SE answered "27" where the answer should be REAL and 3; this is probably due to wrong order for double exponentiation. The incorrect -.75 given for the fifth subproblem was probably due to missing the sign on the operand.

q2 Gave the answer "E H J" which is the best possible, but we graded it zero as would the PG/G. (Had it been a video-taped exam, we would have given it one point when we went through and regraded this problem for further analysis.)

p3 Without comment, we present the subject's response:

   1  F5.2  F4.1  F3.0
   7.3862  7.85  8.74

p4 Gave answers for 1, 3, and 5, but later went back and corrected it by erasing the middle line (it is skipped by the IF... GOTO). Also changed first line from 1, 6 to the correct values of 1, 2. Subject would have received no credit from the interactive grading scheme.

PLATO exam (8:42-9:06)

Subject had trouble changing the answer to the syntax problem. [...]od to correct an answer (ERASE is permitted until the line is finished) [...]. Instructions confusing.

9:06 Spent ten seconds staring at the page intermediate between leaving the cover page and actually quitting the exam. This was either spent trying to understand that yet another keypress was needed to really leave the exam or trying to review the problems on the exam.

post-test questionnaire

12 Verbally the subject reported having typed a number of papers.

13 SE reported not feeling a difference in reading problems from the screen, and yet spent ten minutes longer on PLATO. Subject further mentioned that there was trouble switching problems, which could account for the longer time.

14 Subject did prefer exams on PLATO because they are "more interesting," and "more exciting."

Annotated Log for Subject SF

We took a fairly extensive log of SF's PLATO work and will present much of it here.
Even though it does not reveal any particular anomalies, it does illustrate the level of analysis that can be conducted without videotape.

practice exam (8:36-8:44)

   8:36      Cover page
   :36:30    P1
   :40:25    Cover page
   :40:33    Selects syntax cover page. Character loading begins.
   :40:50    Syntax cover page
   :41:30    FORTRAN syntax directions (few students took this option)
   :42:00    Syntax cover page
   :42:45    Q1
   :44:00    Syntax cover
   :44:05    Cover page

PLATO exam (8:46-9:02)

   8:46:10   P1 Thought for awhile; gave answers very fast.
   :49:55    Cover page
   :50:05    Syntax cover page
   :50:08    Q1 (no error in problem)
   [...]:00  Q2 (no error in problem)
   [...]     (there were errors, but SF found them)
   [...]     Syntax cover page
   8:53:20   Q1 Review (The log says "review," but the subject was probably just having trouble getting to P3.)
   :53:25    Syntax cover page
   :53:40    Cover page
   :53:48    P3 Studied page for two minutes before typing.
   :56:15    P3 Finished
   :56:28    Cover page (Presumably it took thirteen seconds to display.)
   :56:40    P4
   :57:35    P4 Entered first answer.
   :58:15    P4 Entered second answer.
   :58:30    P4 Made an incorrect entry for third answer.
   9:01:05   P4 Entered third answer correctly.
   :01:15    Cover page
   :01:30    Quit
   :01:40    Quit (Pressed last exit button.)

paper exam (9:08-9:18)

p1 Lost three points by incorrectly copying correct answer from work area to answer area. (Gave answer as -3.15.)

q2 Gave answer "E H J" and received zero credit.

p3 Failed to round twice and used wrong format for third answer. (Had "8.7")

p4 Wrote correct answers but did not indicate the end of output as would have been required on the PG/G for full credit. Got full credit anyway.

Annotated Log for Subject SG

practice exam (8:15-8:26)

Did the two arithmetic expressions in two minutes, but spent eight minutes looking for syntax errors in "81 ... .CONTINUE".

paper exam (8:28-8:49)

p1 The only calculation written on the exam is "-15./4." for the third problem. All answers correct.
p2, p3 All answers correct; no other marks on paper.

p4 Modified the values in the array as the computation proceeded. Got all answers correct (even though one iteration modified an array value used by the next).

PLATO exam (8:51-9:06)

   8:51      Begin at monitor
   :51:30    Cover page
   :52       P1 Motionless while thinking of answers. Typed answers very fast.
   :54:30    Corrected wrong answer. Appeared to be disturbed by the sound of the machine (probably the videotape machine and the nature of observational situation).
   :55:15    Cover page
   :55:35    Q1 No syntax error; answered very fast.
   [...]     Q3 Was observably confused by the instructions. Did do Q3 in only 0:15.
   8:59      P2 Reviewed answers
   9:00:05   Syntax cover page
   :00:15    Cover page
   :00:20    P3 Was confused by the replace-a-line scheme.
   :02       Cover page
   :02:30    P4 Worked with paper and pencil some. Answer entries were at :03:55, :03:58, :04:20, and :04:30.
   :04:40    Cover page
   :05:00    Review
   :05:35    Detected and corrected an error.
   :05:40    Finished

post-test questionnaire

12 Wrote on the questionnaire that typing ability influenced performance by "increased speed + fewer clerical errors only".

BIBLIOGRAPHIC DATA SHEET

1. Report No.: UIUCDCS-R-76-782
3. Recipient's Accession No.:
4. Title and Subtitle: DETAILS OF AN EXPERIMENTAL VIDEOTAPE EVALUATION OF AN INTERACTIVE EXAM SYSTEM
5. Report Date: December 1976
7. Author(s): Richard Doring, Lawrence R. Whitlock, Wilfred J. Hansen
8. Performing Organization Rept. No.: UIUCDCS-R-76-782
9. Performing Organization Name and Address: Department of Computer Science, University of Illinois, Urbana, Illinois 61801
10. Project/Task/Work Unit No.:
11. Contract/Grant No.: EC41511
12. Sponsoring Organization Name and Address: National Science Foundation, Washington, DC
13. Type of Report & Period Covered: Research
15. Supplementary Notes:
16. Abstracts: This report provides details of an experiment conducted at the University of Illinois.
An analysis of the data is contained in a companion paper (UIUCDCS-R-76-836) about the student performance on an interactive examination.

17. Key Words and Document Analysis. 17a. Descriptors: PLATO system, automated computer science education system
17b. Identifiers/Open-Ended Terms:
17c. COSATI Field/Group:
18. Availability Statement: unlimited
19. Security Class (This Report):
20. Security Class (This Page): UNCLASSIFIED
21. No. of Pages: 85
22. Price:

FORM NTIS-35 (10-70) USCOMM-DC 40329-P71