VP-SE Research Group (C)

Courseware CAD

Bertrand Ibrahim, Alain Aubord, Birgit Laustsen, Michael Tepper

IFIP fifth World Conference on Computers in Education, Sydney, Australia, 9-13 July, 1990.

Stream: Research on Educational Application of Information Technologies.


Abstract:

Even though the creation of Computer Aided Learning software (CAL) is always done on a computer, most development phases are still usually done with pen and paper. Authoring systems generally tackle the "coding" phase, leaving aside the specification and design phases and often imposing rather severe constraints on the users of such systems. The specification and design of courseware is not an easy or very formal process. Computer Aided Software Engineering (CASE) can nevertheless be very helpful in such an endeavor. We present here a complete development environment that runs on a graphic workstation. It allows a team of teachers and pedagogues to specify lessons graphically and then makes it easier for the team of programmers to translate these specifications into programs that can run on different target machines. The specification uses a very simple formalism and the programs are generated in a general purpose programming language such as Pascal or Ada. This development environment includes, among other tools, a graphic script editor, an automatic program generator, a synchronous multi-window editor to hand-code the parts of the CAL program not generated automatically and a translation manager to help maintain translations of the lessons in different natural languages.

Keywords:

Software engineering, Automatic programming, Graphical specification, Development environment, Courseware, Computer aided learning, CAL, Computer based learning, CBL.

1. Introduction:

The most common approach to the development of computer aided tutoring is to use an authoring system. The problems with such a tool are that it restricts what the developer can do and that it is tailored for individual work. Recent research (Karr87) shows that general purpose (high level) programming languages are still seen as the state of the art for large scale quality courseware development. Such large scale projects are the result of a collaboration between teachers, pedagogues and programmers.

Unfortunately, very few tools exist so far to support large scale courseware development using general purpose programming languages. Moreover, very little has been done to support the whole development process of such software. There are, however, software engineering techniques (Glin84, Raed85) already used in other fields of software development that can very usefully be applied to courseware development.

Software CAD (Computer Aided Design) is a paradigm that has emerged in the last few years (Smith86, Buhr89). It refers to automated software design techniques centered around a wiring-diagram-like graphical representation of software, with a supporting environment that allows for automatic program generation and fast prototyping.

The project we describe in this document (Figure 1) is based on this paradigm and is the result of a long experience in developing large computer aided learning (CAL) applications in a rather traditional way. This project intends to tackle the software development life-cycle of CAL lessons in order to allow the designers to concentrate on the pedagogical aspects of the lessons.

From the very beginning, we made the assumption that the development environment didn't need to be the same as the runtime environment. As a matter of fact, the needs of the designers and coders during the development phases are quite different from the needs of the learners using the final product. It follows that the development environment is targeted to run on a large screen graphic workstation, whereas the environment of the learner is more modest and runs on PC-like machines.

Since teachers do not necessarily have a background in programming, it is essential that the tools made available to them to specify and design a lesson be as simple and close to their "universe" as possible. There are different phases in the development of CAL software and teachers or pedagogues are needed in only a few of them. They shouldn't need to be involved in the programming phase and yet a certain continuity is needed from the specification and design phase, then the programming and debugging phase, to the field tests and final use, including reviews all along the development process.

This continuity is ensured by a document that describes in a very detailed way what can happen during the learning process. This document, which we call a scenario or a script, can be considered as a semi-formal specification of the software that will be built thereafter.

2. Design phase:

The formalism we use in a scenario has been developed by Prof. A. Bork's team (Bork86) at the University of California, Irvine. It allows a complete and detailed description (partially in natural language) of a lesson. Since a good CAL lesson is mainly a dialog between the learner and the computer, the major building blocks of the scenario deal with program output, user input and how the program should respond to user input. All other operations are indicated in free text as instructions to the coder.

This specification formalism is a kind of graphical language. It is precise enough to give an exact description of the behavior of the final product, i.e. the lesson. It is built on just three basic primitives: text that has to appear on the screen, instructions to the coder (in natural language) and predicates (generally corresponding to answer analysis criteria). The specification of a lesson can thus be represented as a directed graph, in which each node corresponds to one of the three basic primitives and where the edges indicate the sequencing of operations (Figure 2). Text to be displayed is written in an ellipse, instructions to the programmer are between curly brackets and predicates are inside rectangular boxes.
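As an illustration only, such a script could be held in memory as a small directed graph. The following is a minimal sketch in Python, not the environment's actual representation; the names `Kind`, `Node`, `Script` and the sample lesson are hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum


class Kind(Enum):
    MESSAGE = "message"          # text shown on screen (drawn in an ellipse)
    INSTRUCTION = "instruction"  # free-text note to the coder (curly brackets)
    TEST = "test"                # predicate on the learner's input (rectangular box)


@dataclass
class Node:
    name: str
    kind: Kind
    text: str


@dataclass
class Script:
    nodes: dict = field(default_factory=dict)  # node name -> Node
    edges: dict = field(default_factory=dict)  # node name -> list of successor names

    def add(self, node, *successors):
        """Register a node together with its outgoing edges."""
        self.nodes[node.name] = node
        self.edges[node.name] = list(successors)


# A four-step lesson fragment (hypothetical content).
script = Script()
script.add(Node("ask", Kind.MESSAGE, "What is 2 + 3?"), "read")
script.add(Node("read", Kind.INSTRUCTION, "Read a number typed by the learner"), "is_five")
script.add(Node("is_five", Kind.TEST, "answer is 5"), "praise")
script.add(Node("praise", Kind.MESSAGE, "Correct!"))
```

Keeping the edges separate from the nodes makes it easy to change the sequencing of a lesson without touching the text of its steps.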

The quality of a CAL lesson highly depends on its non-linearity, i.e. its ability to adjust its behavior to the learner's profile. This non-linearity is achieved by using cascades of tests (predicates) to be applied on the learner's answers to questions or on his actions following a task assignment. For more non-linearity, edges originating from a test box are labeled with a number or an interval. At runtime, a counter is associated with the test box. This counter will be incremented each time the criterion specified in the box has been satisfied. The corresponding edge will be followed depending on the value this counter has. Complementary predicates can be put in sequence (like IF THEN ELSIF instructions). If the criterion is satisfied, an edge on the side is followed; if it isn't satisfied, the criterion just below (if present) is then evaluated, and so on. If none of the criteria apply to the user input, the edges originating from the bottom of the last box indicate the actions to be taken in such an unanticipated case.
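The counter mechanism and the cascade of predicates can be sketched as follows. This is an illustrative Python model under stated assumptions, not the code produced by the environment: the names `PredicateBox` and `run_cascade` are hypothetical, edge labels are modeled as inclusive intervals, and the unanticipated case is simplified to a single fall-through target.

```python
from dataclasses import dataclass
from typing import Callable

INF = float("inf")  # upper bound of an open-ended interval label


@dataclass
class PredicateBox:
    predicate: Callable[[str], bool]  # answer analysis criterion
    edges: list                       # [(low, high, target)], inclusive bounds
    count: int = 0                    # times the criterion has been satisfied

    def target(self):
        """Pick the edge whose label contains the current counter value."""
        for low, high, target in self.edges:
            if low <= self.count <= high:
                return target
        raise LookupError("no edge labelled for count %d" % self.count)


def run_cascade(boxes, answer, fallthrough):
    """Evaluate the boxes top to bottom, like IF ... ELSIF ... ELSE."""
    for box in boxes:
        if box.predicate(answer):
            box.count += 1
            return box.target()
    return fallthrough  # none of the criteria applied


cascade = [
    PredicateBox(lambda a: a.strip() == "5", [(1, INF, "praise")]),
    PredicateBox(lambda a: not a.strip().isdigit(),
                 [(1, 1, "hint_digits"), (2, INF, "show_example")]),
]

# The same wrong kind of answer leads to a different step the second time.
outcomes = [run_cascade(cascade, a, "try_again") for a in ("five", "five", "4", "5")]
```

Here `outcomes` records the step reached for each successive answer, which shows how a repeated mistake can be routed to a different response than a first one.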

This formalism does not prejudge which of the learner or the computer has control of the other. It has been used to specify very different kinds of lessons, ranging from mastery learning tutorials to microworlds with embedded tutoring as well as self-instructional software.

In our environment, a scenario is built using a script editor. The script editor is essentially a graph building editor in which the basic primitives correspond to the formalism we have just described. It is used to interactively enter the detailed specification of a lesson.

Using a mouse and keyboard interface, the designers specify each step of the lesson one by one, edit the textual content of each step and add or change sequencing links as they wish. These links are shown on the graphical representation as arrows, but the script editor does not treat them as mere drawing objects. The user of the script editor can indeed use them to move along the script. He just has to select an arrow on the graph and pick the "jump to arc origin" or "jump to arc end" item in the menu to display what is at the beginning or end of a given edge.

As this example shows, the script editor is not just a simple graphic editor. It has knowledge of the structural aspect of a script and, as such, disallows most meaningless operations and offers a few primitives that help the designers check the consistency of their work. There is, for instance, a validation primitive that checks among other things that every step of the script is reachable and that there is no gap in the labelling of the edges originating from a test box.
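The two checks mentioned above could look like the following sketch. It is written in Python for illustration and makes assumptions the text does not state: the functions `unreachable_steps` and `label_gaps` are hypothetical, counters are assumed to start at 1, and an open-ended interval is written with `None` as its upper bound.

```python
def unreachable_steps(edges, start):
    """Return the steps that cannot be reached from the first step.

    edges maps each step name to the list of its successor names.
    """
    seen = {start}
    stack = [start]
    while stack:
        for nxt in edges.get(stack.pop(), []):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return sorted(set(edges) - seen)


def label_gaps(intervals):
    """Return the counter ranges not covered by any edge label.

    intervals is a list of (low, high) inclusive labels on the edges
    leaving one test box; high is None for an open-ended label.
    """
    gaps = []
    expected = 1
    for low, high in sorted(intervals, key=lambda iv: iv[0]):
        if low > expected:
            gaps.append((expected, low - 1))
        if high is None:
            return gaps
        expected = max(expected, high + 1)
    return gaps


orphans = unreachable_steps({"ask": ["test"], "test": ["ask"], "lost": ["ask"]}, "ask")
missing = label_gaps([(1, 1), (3, None)])  # no edge for the second success
```

In this example `orphans` lists the step that no edge leads to, and `missing` reports that no edge is labelled for a counter value of 2.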

3. Prototyping phase:

As we said earlier, the specification we get from the design phase is not purely formal. It includes what we call instructions to the coder, which are written in natural language without any restriction on what they can contain. It is therefore out of the question to hope that the environment will ever be able to generate automatically the code for all such elements of the specification. Answer analysis criteria may also be specified in a rather informal way and are thus difficult to generate automatically too.

It is however relatively easy to generate code for the formal part of the specification, i.e. the messages and the sequencing of operations. The environment includes for this purpose an automatic program generator that translates the script into a program written in a high level programming language such as Ada, Modula 2, UCSD Pascal, or Turbo Pascal. This tool can be seen as a graphical compiler that transforms a graphical representation of a lesson into an executable program. The target language can be chosen according to what is available on the target machine on which the code will be running.

Since the specification is not purely formal, the automatically generated program does not totally implement it. However, this is not a problem. A node of type message is translated into a call to the text display primitive with the name of the corresponding message as parameter. A node of type test is translated into a call to a boolean function whose body will be defined in a separate module. A node of type instruction to the coder is translated into a call to a procedure whose body will also be defined externally in a separate module.

Even though the body of a procedure corresponding to an instruction to the coder will later have to be "hand-coded", the automatic program generator initially generates a dummy body that displays on screen the text of the instruction. Similarly, the body of a boolean function corresponding to an answer analysis criterion will initially prompt the user for the value the function is supposed to return.
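The idea of dummy bodies can be illustrated with a toy generator. The real tool emits Ada, Modula 2 or Pascal; the sketch below emits Python source only to keep the example short and runnable, and the function `generate_stubs` and the step names are hypothetical.

```python
def generate_stubs(nodes):
    """Emit source code for the separate module of hand-codable bodies.

    nodes is a list of (name, kind, text) with kind either
    'instruction' or 'test'; names are assumed to be valid identifiers.
    """
    lines = []
    for name, kind, text in nodes:
        if kind == "instruction":
            lines += [
                f"def {name}():",
                "    # Dummy body: show the instruction until it is hand-coded.",
                f"    print({text!r})",
                "",
            ]
        elif kind == "test":
            prompt = text + " (y/n)? "
            lines += [
                f"def {name}(answer):",
                "    # Dummy body: ask the reviewer what the test should return.",
                f"    return input({prompt!r}).strip().lower() == 'y'",
                "",
            ]
    return "\n".join(lines)


source = generate_stubs([
    ("read_number", "instruction", "Read a number typed by the learner"),
    ("is_five", "test", "answer is 5"),
])
```

The generated module already runs: instruction stubs print their text and test stubs ask the person at the keyboard for the outcome, so the sequencing of the lesson can be reviewed before any hand-coding takes place.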

This means that the automatically generated program can already run as is on a target machine without any need for coding. Its behaviour will not fully correspond to the specification, but it will already give a rough idea of what the final product will look like.

Running this program on a target machine allows the designers, already at this early stage, to review the text as it is displayed on screen, to work on and tune the screen design and to test the different paths the learner can follow depending on his actions or answers.

As an alternative to program generation, we have contemplated a direct interpretation of the script on the development machine. This solution looks appealing, but it is not as easy to implement as it might seem at first. It implies, among other things, that one emulates the screen and window management of the target machine on the screen of the development machine.

On the other hand, the choice we made, i.e. generating source code that will be compiled on the target machine, has the advantage of integrating the prototyping and coding phases.

4. Coding phase:

The code produced during the prototyping phase includes dummy procedures and functions in separate modules corresponding to the instruction to the coder and to the test elements of the script. The task of the programmers is then to replace these dummy bodies with code that really does what the designers asked for.

One major problem in using code produced by an automatic program generator is that this code is hardly readable by a human being. Most of the symbols used in the code are generated by rather simplistic algorithms and are thus quite cryptic. Isolating the parts that have to be "hand-coded" is already a step in the right direction since the programmers will not have to deal with the automatically generated code.

The use of a modular target programming language is essential here. It allows a complete separation between the automatically generated code and the code added by the programmers, so that the programmers do not need to modify the code produced by the automatic translation.

Programming a procedure when one is not able to see how it fits in the whole application is not an easy task. The programmers can therefore use a specific tool of our development environment to write their code. This tool, a synchronous multi-window editor, allows a programmer to complete, in one window, the automatically generated code corresponding to the part of the script that is showing in another window. The idea is to have different windows show different views of the same thing: a regular text editor in one window to edit one of the separate modules that have to be hand-coded and the script editor in another window to show the corresponding specification.

Any script window can be synchronized with a module window, or vice-versa. The script window is read-only in the sense that one can only issue positioning commands in it. One can enter positioning commands in any of the windows and ask for the other window to "synchronize", i.e. show the corresponding part of the view it handles. The programmer can thus very easily see, write or modify the code corresponding to a specific part of the script, by finding the location in the script window and then asking for the module window to synchronize, i.e. show the code for that part of the script. One can also find a specific location in the code (e.g. where an error occurred) and then ask for the script window to synchronize, i.e. show the part of the script that is the specification of the corresponding code.
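Synchronization in both directions amounts to a lookup between script steps and positions in a module. The following Python sketch shows one possible index; the class `SyncIndex`, the line numbers and the assumption that each body extends until the next one begins are all hypothetical.

```python
from bisect import bisect_right


class SyncIndex:
    """Two-way lookup between script steps and lines of a hand-coded module."""

    def __init__(self, spans):
        # spans: (first_line, step_name) pairs, one per procedure or function.
        ordered = sorted(spans)
        self._starts = [line for line, _ in ordered]
        self._steps = [step for _, step in ordered]

    def line_for_step(self, step):
        """Script window -> module window: where does this step's code start?"""
        return self._starts[self._steps.index(step)]

    def step_for_line(self, line):
        """Module window -> script window: which step does this line belong to?"""
        i = bisect_right(self._starts, line) - 1
        if i < 0:
            raise LookupError("line %d precedes the first body" % line)
        return self._steps[i]


index = SyncIndex([(1, "read_number"), (12, "is_five"), (20, "draw_axes")])
step_at_error = index.step_for_line(15)      # e.g. a compiler error on line 15
code_start = index.line_for_step("draw_axes")
```

With such an index, an error reported on a given line of a module can be traced back to the step of the script that specifies it, and a step selected in the script can be located in the code.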

One can have more than one pair of script-program windows at the same time. With this tool, the programmer can always see the specification at the same time as the code he is working on. This specification (the script) should help him understand and manage the code, since it can be seen as a human readable documentation of the code.

5. Debugging and maintenance phase:

Another problem that could arise from the approach we have taken is related to the fact that the development machine is not the same as the machine on which the CAL program will run. If there is an error in the code written by the programmers, it is on the target machine that it will show up.

For this delicate phase, the environment includes a remote supervision tool. It is a program that runs on a development machine at the same time as a lesson is running on a target machine connected to the same local area network (LAN). This tool is used to follow on the development machine the execution of the CAL program. To do this, the automatic program generator has to include, in the code it generates, instructions that send information over the network. The information sent indicates which program is running, which step of the script is currently being executed and possibly the name of the learner (e.g. for automatic curriculum).
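The content of such a trace message can be sketched as follows. The wire format is not described in this paper, so the JSON encoding, the field names and the `Supervisor` class below are assumptions chosen for illustration; the network transport itself is left out.

```python
import json


def encode_trace(program, step, learner=None):
    """Build the message a generated lesson would send at each step."""
    record = {"program": program, "step": step}
    if learner is not None:
        record["learner"] = learner  # optional, e.g. for curriculum tracking
    return json.dumps(record).encode("utf-8")


def decode_trace(payload):
    """Recover the record on the development machine."""
    return json.loads(payload.decode("utf-8"))


class Supervisor:
    """Remembers the step each running lesson has reached."""

    def __init__(self):
        self.current = {}  # program name -> last reported step

    def receive(self, payload):
        record = decode_trace(payload)
        self.current[record["program"]] = record["step"]
        return record


supervisor = Supervisor()
supervisor.receive(encode_trace("fractions", "ask"))
last = supervisor.receive(encode_trace("fractions", "is_five", learner="Anne"))
```

The supervisor only needs the step name to highlight the corresponding node in the script, which is why the message can stay this small.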

The remote supervision tool can thus directly show on the script (which is a human readable specification of the lesson) where in the lesson the target program is. It allows a programmer to find much more easily where in a lesson a given error occurred. The synchronous multi-window editor can then be used by the programmer to move directly to the code that generated the error and correct it.

A similar mechanism makes it possible to automatically collect statistics on the learning process, once the CAL program is operational. The analysis of these statistics is useful for the designers of a lesson to review the pedagogical design. This review can then be carried out with the script editor.

6. Translation phase:

In a country that has four official languages, it appears natural for us to want the dialogues of the CAL programs we develop to be translated into different languages. That is why there is in our view another rather important phase: the translation of the dialogues (Frank85, Ibra86). Indeed, all the CAL software we develop is intended to be easily translated into different languages using the Latin alphabet (with some variants for accented letters). This translation phase can happen before, as well as after, design reviews. Therefore, it implies using an editing tool that also has the functionalities of a version manager to handle a database of updates.

We indeed make the assumption that there will not be a canonical language in which the design and all the reviews are made and from which all translations are derived. It may very well happen that the design is done in, for instance, English, and that the lesson is then translated into French and Italian. Any of these three versions can then be edited with the script editor because some field test has shown a modification to be desirable. The problem is then to make sure that all versions are up-to-date.

To help maintain the coherence of the different translations, a multi-window editing utility manages the modifications that are made to the messages. This tool maintains a database of modifications for the different message files, which lets the person(s) in charge of the translation go directly to the places that need to be updated. It shows the original message in the initial language in one window, the updated version of the message for this same language in a second window, the translation of the original message in a third window and, in a fourth window, lets the user enter the update for the translated message. Only the fourth window is active, i.e. it is the only one in which the user can apply a modification. The other windows are passive (read only), but all four windows are synchronized: moving to a specific message in any of them generates a similar move in the other ones.

This tool is separate from the script editor since it has to be used by different persons. The translations will indeed not be done by the original designers, but rather by people proficient in both the original and target languages. The translation manager has nevertheless to be coupled with the script editor in the sense that it has to know which changes have been made with the script editor on one version of the messages in order to be able to indicate to the translators where to make updates in the other versions.
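One way to picture the coupling between the script editor and the translation manager is a per-message revision log with no canonical language. The sketch below is a hypothetical Python model, not the structure of the actual database; the class `TranslationLog` and its method names are assumptions.

```python
class TranslationLog:
    """Tracks, per message, which language versions lag behind the newest edit."""

    def __init__(self, languages):
        self.languages = list(languages)
        self.revision = {}  # message id -> {language: revision number}

    def record_edit(self, message_id, language):
        """Called when the script editor changes a message in one language."""
        revs = self.revision.setdefault(
            message_id, {lang: 0 for lang in self.languages})
        revs[language] = max(revs.values()) + 1

    def record_translation(self, message_id, language):
        """Called when a translator brings one language up to date."""
        revs = self.revision[message_id]
        revs[language] = max(revs.values())

    def pending(self, language):
        """Messages whose version in this language is older than the newest."""
        return sorted(m for m, revs in self.revision.items()
                      if revs[language] < max(revs.values()))


log = TranslationLog(["en", "fr", "it"])
log.record_edit("greeting", "en")          # design done in English
todo_fr = log.pending("fr")                # French translator's work list
log.record_translation("greeting", "fr")
log.record_edit("greeting", "fr")          # field test leads to a French edit
todo_en = log.pending("en")                # English now needs updating
```

Because any language can receive the newest revision, an edit made in French after a field test puts the English and Italian versions on the list of pending updates, in line with the absence of a canonical language.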

7. Future extensions

Even though the development of this environment is not yet complete, we are already working on extending its capabilities. As the reader may have noticed, the specification formalism we use is open ended, in the sense that it allows the designer to give part of the specification in natural language (as instructions to the coder). In the first phase, we designed the automatic program generator so that it takes care only of the display of the messages and of the sequencing of the operations. The instructions for the coder are, for the moment, translated into a procedure call and the answer analysis criteria into function calls. These procedures and functions are in a separate module and their body must be written by a programmer.

In a second phase, we intend to augment the translation capabilities of the automatic program generator by using artificial intelligence techniques, i.e. knowledge acquisition and generalization techniques. It will try to detect similarities between different instructions to the coder and query the programmer for a predicate, allowing the generator to associate specific parameters to each instruction to the coder.

Another extension we have in mind for this project is to use groupware techniques for the design phase. We indeed generally have more than one designer work on the development of the same set of lessons at the same time and we often have different people do the reviews. It would therefore be useful if the environment could support such team work either by allowing different people on different workstations to work on the same script at the same time (distributed work) or different reviewers to annotate the original script without changing it in the absence of the designers (asynchronous work).

8. References
