This paper was invited by the Information Technology Institute, Singapore, to form part of their HCI feature articles. In due course it will probably become available via their website (follow the link to Special Feature and then, if necessary, to Previous Features). Meanwhile, the APA Guidelines suggest it should be cited as:
MODELLING IN OSM - AN OVERVIEW
INFORMAL REASONING FROM AN OSM
FORMAL REASONING FROM AN OSM
TESTING OUR CLAIMS
Computer Based Learning Unit
School of Computing Science
We introduce a new HCI modelling technique, 'Ontological Sketch Modelling', which identifies certain types of usability problem by exposing misfits between the user's conceptual model and the conceptual model imposed by the device. These types of problem are not addressed by previous techniques of HCI analysis. To make the approach widely accessible and easy to use we are developing an 'OSM editor'; after describing a device using the editor, the user can obtain a computer-generated analysis containing 'usability alerts', which warn of potential misfits.
Meeting a new application or a new IT device for the first time can be like starting to watch a soap after missing half the episodes. What are all these buttons and lights? What do their labels mean - TD, TA, RDS, and the like? How does it all make sense and how does it relate to what you already know?
Then, when you understand the basic functionality, different problems arise. You've got started - you've drawn some curves, or started to organise your home finance, or begun your new song. Fine stuff, but you want to make a small change - and now is when you find out how usable your new gizmo really is. Perhaps you change one of the points on your curve and everything else moves around in apparently unpredictable ways. Or you discover that to make a small change to your home finance system, you'll have to do so much work that you might as well start again. Or maybe you find that although you can change the tune and the chords you've started to compose, it means understanding a whole new set of obscure tools.
Conventional approaches to HCI concentrate on how fast you can get the buttons pressed, or whether the system messages are confusing. We want to find a way to go deeper.
We are pioneering an approach to HCI that concentrates on the user's conceptual model of the device -- and of the domain as well. Analysing the misfit between these can reveal potential problems in learning and use, problems of a semantic type that are not revealed by existing HCI approaches. Here are some common types of misfit.
Inexpressiveness Definition: the user wishes to do something that is part of the conceptual domain but cannot be expressed using the device.
Example: electronic organizers cannot indicate relative importance of engagements, nor links between them (see below)
Indirectness Definition: the user thinks of a single direct operation on a conceptual entity, but the device requires an indirect operation.
Example: trying to lay out graphics aesthetically, command-based interfaces are indirect while direct manipulation interfaces are direct - BUT trying to lay out graphics in predefined positions, the opposite is true!
Viscosity Definition: the user thinks of a single direct operation but the device requires an indirect operation, or a large number of operations
Example: updating section and figure references in a standard word-processor
Premature commitment Definition: the user wants to do an operation but, before it can be done, the device requires a commitment to something that can only be guessed at, or that can only be determined by lookahead.
Example: starting to draw a family tree or a map, the first mark must be made on the paper before knowing how far the map will extend.
Second example: using an algebraic calculator to solve a problem stated in words, the user has to look ahead to discover whether parentheses will be needed at the beginning of the calculation.
Misfits cannot be revealed by any approach to HCI that focuses solely on either the user or the device. Traditional task-centered user-modelling for HCI has some very effective results, but it cannot reveal misfits because it does not explicitly consider how the user's domain model relates to the domain model imposed by the device.
In this paper we shall introduce Ontological Sketch Modelling (OSM). The idea is that the modeller describes the entities that are visible, and their attributes and how they are linked within the device; and also describes the entities contained in the user's conceptual model. The resulting entities may be private to the device (the user cannot alter them), or they may be private to the user (the device does not know about them), or they may be shared (accessible to both the device and the user). All communication between the two worlds of user and device takes place through the shared entities. If the user-private entities do not fit well onto the shared entities, the device will have usability problems.
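The three-way distinction above can be sketched as a small data structure. This is a minimal illustration in Python, not the authors' own representation (their formal work uses Prolog); the names `Visibility`, `Entity` and `unrepresented`, and the example entities, are assumptions made for this sketch.

```python
from dataclasses import dataclass
from enum import Enum


class Visibility(Enum):
    USER_PRIVATE = "user-private"      # the device does not know about it
    DEVICE_PRIVATE = "device-private"  # the user cannot alter it
    SHARED = "shared"                  # accessible to both user and device


@dataclass(frozen=True)
class Entity:
    name: str
    visibility: Visibility


def unrepresented(entities):
    """Return the names of user-private entities, i.e. concepts with no device counterpart."""
    return [e.name for e in entities if e.visibility is Visibility.USER_PRIVATE]


# Hypothetical model of a drawing package; the entity names are illustrative only.
drawing_model = [
    Entity("whale", Visibility.USER_PRIVATE),
    Entity("line", Visibility.SHARED),
    Entity("undo-buffer", Visibility.DEVICE_PRIVATE),
]

print(unrepresented(drawing_model))
```

Listing the user-private entities is only a starting point: whether each one maps well onto the shared entities still has to be judged by the analyst.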
Aspirations
Ontological Sketch Modelling is still a developing approach, but we hope that it will have the following virtues; if we meet all our aims, we believe it will be an approach that is useful and usable.
* OSM will be easy to learn and easy to do, because it is directly concerned with entities and concepts. Traditional task-centered approaches to HCI are harder because they are indirect, like trying to describe a teapot by describing the tasks you could do with it.
* OSM will avoid 'death by detail'. Many HCI techniques generate a huge mass of details: OSM is succinct.
* OSM will reveal problems of a different sort from traditional task-centered modelling, because it focuses on fit or misfit at the conceptual level instead of on the surface features of devices.
* OSM will lend itself equally to informal pencil-and-paper modelling and to computational analysis of formal models.
Background
Our approach rests on a number of points that have been established in previous research by ourselves or others.
Sketchy models are needed
There is a place for detailed analytical models, but they take time to apply and time to learn, and the high time-investment has deterred designers from using many existing HCI techniques (Bellotti, 1989). At present, the few techniques that are quick to use focus on surface features, not on deep problems.
Misfits can be identified
Misfit analysis has been attempted at least twice in the HCI literature (although not by that name), but has not become a strong tradition. Moran's ETIT analysis (1983) mapped the 'external' task of the domain onto the 'internal' task of the device, from which an efficiency metric could be computed, essentially the number of device actions required to achieve one domain-level goal. Payne (1993) drew on ETIT for his 'Task-Entity Analysis' by which he explained the low usage of early electronic diaries and calendars.
"A task entity analysis begins with an enumeration of the conceptual objects in the task domain and their interrelations and inspects the degree to which these entities and relationships can be represented in the device." (Payne, 1993, p. 95)
Intentions to do things formed the main class of things-to-be-remembered. Intentions are nestable ("an intention to phone a colleague is part of a broader intention to organise a conference") and they have dependencies ("you cannot book the conference dinner until you have an idea of the expected number of delegates"). Some intentions are more important than others, and some have to be performed at a precise time while others merely have to be done sometime.
Payne found that users of paper diaries could make use of variations in writing size, scribbles and arrows, and vaguely-specified times to convey all those different attributes of intentions. Electronic diaries were unsatisfactory because their entities and attributes were too limited to express the user's conceptions of the domain - there was no way to indicate all the subtle but important differences between types of intention that he identified. As a result, even people who used electronic diaries also used paper ones as support.
Payne's approach was entirely informal and could not be attempted without knowledge of HCI and cognitive psychology; also, it was not powerful enough to yield alerts for viscosity, etc. Nevertheless, explaining the weakness of electronic diaries by relating conceptual entities to the expressiveness of the device was an important result.
Identifying misfits suggests design improvements
The best-developed work on misfits appears to be the 'cognitive dimensions' framework developed by Green and others (Green, 1989, 1990, 1996; Green and Petre, 1996). These dimensions, including terms such as viscosity, premature commitment and other examples described above, are easy to understand, and they describe the real difficulties that users talk about and complain about. By providing a richer language in which to express user problems they create 'discussion tools' which allow users and designers to communicate at a higher level than merely describing the surface details of an application.
Moreover, thinking in these terms shows up the trade-offs between dimensions. Designers can alleviate viscosity by introducing new 'power tools', such as style-sheets in word processors, but by doing so they increase the abstraction level of the device. If the new user has the option of working the device without having to master the power tools, then the result will be quite good; but if the new user has to master the power tools at an early stage of learning, then probably the entry cost of getting to use the device will be a serious deterrent.
Experience in several domains has shown that the terms used in the cognitive dimensions framework are comprehensible and that designers can gain a better understanding of possible user difficulties from analysing their work in these terms; moreover, they can find themselves prompted to redesign their systems in the light of their analyses (Yang et al. 1998).
But the cognitive dimensions framework is not a complete answer. If traditional task-analytic approaches are too detailed for some purposes, cognitive dimensions are too broad for others. If traditional HCI is too precise, cognitive dimensions are too undefined and intuitive.
The Ontological Sketch Model is meant to stand midway between these poles. It is precise but not over-detailed. It yields analyses of some of the cognitive dimensions, but does not try to achieve complete coverage.
User knowledge can be modelled
The attempt to characterise the user's knowledge and to reason about the consequences of that knowledge is also found in the work on Programmable User Modelling (see, amongst others, Blandford and Young, 1996). PUM concentrates on fine detail analysis of the user's knowledge of the interaction device, producing computational models written in Lisp or Soar which simulate the user's cognitive processes in reasoning out how to perform a task.
Like almost all modelling approaches in HCI, PUM is task-centered, but it is unusual in being an executable model. While the results are impressive within their scope, it must be noted that the fine level of detail required makes it extremely tedious to construct a PUM. Moreover, although specific problems can be identified with particular designs, there is no easy way to generalise the problems identified, because there is no classification scheme to which they can be referred. In contrast, the approach taken in OSM analysis, although more limited in what problems it can identify, allows the problem to be classified and therefore is capable of prompting the designer to choose one of the standard remedies for such a problem.
Nevertheless, the PUM work is an important part of the background, because it has shown that HCI can model users at the knowledge level, rather than at the action level that characterises so many approaches to HCI.
Entity modelling reveals misfits
Task-centered models such as GOMS do not reveal misfits, but the analysis performed by Payne (1993) introduced the idea of task entities rather than task procedures. Green and Benyon (1996) applied entity-relationship modelling to HCI, modified to include some of the important aspects of information artefacts, in a scheme called ERMIA (entity-relationship modelling for information artefacts). Entity-relationship modelling, a technique long practised in the design of information systems, lists entities and their attributes and the relationships between those entities. It is a relatively weak expressive system which can nevertheless help to ensure that records are adequate and that information can be found when needed.
In Green and Benyon's modified version, the models included conceptual entities as well as device entities. By this means some types of misfit could be identified, although not all the types that OSMs can identify. On the other hand, ERMIA can produce results that are beyond OSM, such as estimates of the 'cost of knowledge' (originally defined by Card et al., 1994) and the 'cost of update' (a type of viscosity misfit).
On the plus side, the ERMIA analyst could choose the preferred level of detail in a way that was not possible in many of the other HCI modelling approaches. We have succeeded in preserving this important characteristic. On the negative side, although ERMIA was a successful language for collaborative modelling (Whitelock et al. 1994), it forced the analyst to work in terms of very abstract relationships, which is not easy.
Expert explanation starts with entities
At this point we turn to the activity of modelling, rather than the contents of the model. Making a model can be compared to explaining a device. Recent research (Cawsey, 1993) shows that when experts give explanations of devices they frequently start by identifying the device type and then go on to list the components, describing their constituents and their functions, possibly at some length, ending with causal-event descriptions at the process level.
Modelling in OSM takes almost exactly that course. The entities and their constituents are described. Causal events are modelled as dependencies. Moreover, those entities that are part of the device itself are likely to be visible and will therefore prompt the modeller, while those entities that are part of the conceptual domain are at least likely to have well-understood names.
In contrast, other modelling techniques are much less direct. Task-centered models obviously require a task analysis. Doing a good task analysis requires practice. Modelling internal relationships at an abstract level, as in ERMIA, is downright difficult.
OSM draws on the sources above by modelling entities rather than tasks, by adopting a deliberately sketchy representation, by representing the features that lead to misfits, and by allowing the level of detail to be chosen by the analyst. OSM models represent entities (and their attributes), constraints between entities, and to a lesser extent the actions that affect those entities. The most important feature of OSMs is that they include both domain entities and device entities.
Entities are linked together in several possible ways. Some links are dependencies: changing an attribute of one entity may cause a variety of changes (e.g. adding a word to a document may change its length, and may cause the pagination to change, so the table of contents may no longer be in synch, and so on). These dependencies may cross the bounds between domain entities and device entities -- in fact, if the user is to be able to change the domain-relevant entities, the dependencies must cross the bounds, leading to device entities that affect the domain entities.
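The word-processor example of chained dependencies can be sketched as a walk over a small graph. The following Python fragment is only an illustration: the attribute names in `AFFECTS` and the function `affected_by` are assumptions invented for this sketch, not part of any OSM notation.

```python
from collections import deque

# Hypothetical dependency links: changing the key may change each listed attribute.
AFFECTS = {
    "document.words": ["document.length"],
    "document.length": ["document.pagination"],
    "document.pagination": ["table_of_contents.page_numbers"],
}


def affected_by(changed, affects=AFFECTS):
    """Return every attribute that may change when `changed` changes, breadth-first."""
    seen = []
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for target in affects.get(current, []):
            if target not in seen:
                seen.append(target)
                queue.append(target)
    return seen


print(affected_by("document.words"))
```

Following the links transitively shows how one small edit can reach attributes that the user never touched directly.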
To see how this works, consider drawing a figure, such as a whale (which is what we asked our subjects to do in the experiment mentioned below). The whale is a domain-relevant entity, consisting of lines and curves; the lines and curves are themselves also domain-relevant, but unlike the whale itself, they have a device representation, so whereas the whale is user-private, the lines are shared. If the user wants to alter some part of the whale, such as its tail, then each line of the tail must be altered, because there is no shared entity that is made up of [all the lines of the tail]. So changing the tail will be relatively viscous.
Other links expressible in OSM include hierarchical composition, a simple form of inheritance, and constraint. Constraints are the most interesting, especially constraints that exist in the conceptual domain but are not inherent in the device. Nothing in a typical WYSIWYG word-processor constrains the figures to be numbered in sequence, but that is a constraint imposed by the domain, and it is one that is potentially quite time-consuming to fulfil, as is evident from OSM analysis.
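The figure-numbering constraint can be made concrete with a short sketch. The Python below is an illustration under stated assumptions: figure labels are taken to be integers listed in order of appearance, and the function names are invented here. It counts only the figure labels themselves, not the cross-references in the text that would also need repair.

```python
def figures_in_sequence(figure_numbers):
    """Domain constraint: figures are numbered 1, 2, 3, ... in order of appearance."""
    return list(figure_numbers) == list(range(1, len(figure_numbers) + 1))


def renumbering_edits(figure_numbers):
    """Count the figure labels that must be changed by hand to restore the sequence."""
    return sum(
        1
        for position, number in enumerate(figure_numbers, start=1)
        if number != position
    )


# A new figure (provisionally labelled 5) is inserted after the first one.
labels = [1, 5, 2, 3, 4]
print(figures_in_sequence(labels))
print(renumbering_edits(labels))
```

Because the device does not enforce the constraint, every label counted here is a separate manual edit, which is the kind of repeated work that a viscosity alert is meant to flag.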
Compared to writing with a word-processor, a drawing package offers rather few device entities and the user has to translate the domain-relevant entities into device terms. But on the plus side, a drawing package usually contains few dependencies between its entities: adding a line usually affects very few other parts of the drawing, so that the user can add, delete, or move individual lines very freely. In contrast, text can contain many types of dependency both within sentences and between them, so that when parts of the text are moved around the writer has to spend time repairing broken dependencies.
In short, drawing packages and word-processors have many differences at the fit/misfit level. These differences will affect their usability for different types of work.
The differences could be modelled in reasonably faithful detail by some of the advanced modelling and knowledge representation languages, but to do so would submerge the analyst in the 'death by detail' that we are anxious to avoid. We have therefore adopted the sketchiest of approaches that can reveal a reasonable amount about the dependencies and their possible consequences. There are many results that can be obtained from our sketchy level of analysis, but there are also some that cannot be analysed; however, to get deeper results would greatly increase the labour of modelling and would also greatly increase the training required to use our system.
One of the great benefits of our sketchy approach is that much of the activity of modelling can be performed by inspection. When the modeller thinks of an entity, they just write the name down, without deep analysis into its nature.
The purpose of writing out the OSM is to help the analyst to spot potential usability difficulties. While writing the OSM, or afterwards, the analyst can check for potential difficulties. These include:
The small study described below shows that informal OSM analysis is quite successful. However, at least some degree of expertise is required, and even though the approach uses sketchy modelling, some degree of reflection is required to find the usability features inherent in a design. For these reasons we also wished to develop an algorithmic approach that could be used in a simple mechanical fashion, at least for a first pass.
The apparatus of entities and attributes can readily be represented in more formal terms, and the usability properties listed above can then be represented as conditions on the formal representation. We have developed proof-of-concept programs to show that many of the usability conditions can be extracted algorithmically.
Our work has used Prolog as the formal vehicle. Because writing multitudinous Prolog assertions is slow and error-prone we have developed an 'OSM editor' in Hypercard (again at a proof-of-concept level) with a table-based interface, generating the appropriate Prolog assertions (see Figure 1). The usability alerts can then be generated automatically by scanning the Prolog model. For example, a usability alert for repetition viscosity would be generated if the program detected the following conditions:
there is an entity-attribute pair E(A) such that:
    E is domain relevant
        [and therefore the user may want to change it]
    E(A) is not directly modifiable
        [so the user will have to change it indirectly]
    for each P(Q) that affects E(A):
        P(Q) is modifiable
        E(A):P(Q) :: 1:M
        [so each individual P(Q) may need to be modified]
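The conditions above can be sketched as a scan over a small model. This is a Python illustration rather than the authors' Prolog program; the `Attribute` class, the dictionary layout, and the example entities (a whale's tail made of lines) are assumptions made for the sketch. One further assumption is labelled in the code: an attribute that nothing affects raises no alert.

```python
from dataclasses import dataclass, field


@dataclass
class Attribute:
    domain_relevant: bool = False
    modifiable: bool = False
    # Each link is (key of the affecting attribute, cardinality), e.g. (("line", "position"), "1:M").
    affected_by: list = field(default_factory=list)


def repetition_viscosity_alerts(model):
    """Return the (entity, attribute) keys that satisfy the repetition-viscosity conditions."""
    alerts = []
    for key, attr in model.items():
        # E must be domain relevant and E(A) must not be directly modifiable.
        if not attr.domain_relevant or attr.modifiable:
            continue
        # Assumption: with no affecting attributes there is nothing to repeat, so no alert.
        if not attr.affected_by:
            continue
        # Every P(Q) that affects E(A) must be modifiable and related 1:M.
        if all(model[src].modifiable and card == "1:M" for src, card in attr.affected_by):
            alerts.append(key)
    return alerts


# Hypothetical drawing-package model: the tail's shape can only be changed line by line.
model = {
    ("tail", "shape"): Attribute(
        domain_relevant=True,
        modifiable=False,
        affected_by=[(("line", "position"), "1:M")],
    ),
    ("line", "position"): Attribute(domain_relevant=True, modifiable=True),
}

print(repetition_viscosity_alerts(model))
```

The sketch reports only which entity-attribute pairs meet the conditions; wording the alert and suggesting a remedy are left out.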
Our intention is to produce a modelling technique that is demonstrably usable and useful. At an early stage in the project we conducted an experiment using the first version of the OSM approach (subsequently revised, partly as a result of the experiment). Two drawing packages were compared, ClarisWorks and JSketch, in a study conducted with a group of 20 final-year undergraduate students who were enrolled on a module on HCI and Graphics (Blandford and Green, 1997).
To assess the usefulness of the OSM approach, we did an OSM-based usability analysis of each of the software packages we were using, and then compared our predictions against empirical data. Students were put into pairs, matched as closely as possible for prior experience. Each pair was allocated to one of the two drawing programs, one partner making a drawing and thinking aloud, while the other partner noted what difficulties were encountered. Students were asked to draw a whale (pictures were provided as a guide), then to modify it by moving its tail. The following week, the same procedure was repeated, with each pair of students using the other program.
The difficulties encountered by the students were compared with the difficulties that we predicted from our prior OSM analyses. As predicted, most students learned to use the basic facilities of JSketch readily. The results for ClarisWorks were less clear-cut. Of the nine specific difficulties encountered by subjects, four had been predicted, but five had not. Additional predictions were made about aspects of the program that subjects never got around to exploring. Some of the unpredicted difficulties were ones that an OSM analysis would not be expected to highlight; for example, students commented that they had difficulty getting the shape acceptable -- a point that might emerge from several aspects of the system being difficult to work with, but not one that would emerge directly from an OSM analysis. Other difficulties could have been predicted, but we later realised that our own OSM models had been inadequate, because we had based our descriptions on documents that gave incomplete information.
To assess the usability of the OSM we asked the same 20 subjects to produce their own OSM descriptions of the same systems, after the "usefulness" study had been completed. We found that:
Their subsequent usability reports on OSM confirm the evidence from the data -- that entities and actions were easily comprehended and described, but that relationships presented more difficulties, and that few of the subjects really understood how the modelling was meant to be used to derive usability assessments of the system.
While these initial results are promising, since they represent modelling activity after a very short period of training and practice, their greatest value lies in the way they have been used to inform re-design of the OSM. It should also be noted that the applications that we studied, being characteristic of their class, did not contain certain types of potential user problem.
There are many different HCI modelling approaches. Very few of them are in serious use, for a variety of reasons. To make OSM genuinely useful, at least three steps are needed. We want to make it highly accessible; we want to demonstrate its effectiveness in real contexts; and we want to extend its coverage to collaborative situations.
Accessibility
One way to improve OSM accessibility is by developing a web-site devoted to it. We have started such a site by giving short OSMs of a number of familiar devices (see Further Reading). In the future, funding permitting, we shall greatly extend this site with more examples and with a guide to using the approach.
Our existing prototype OSM editor needs to be further developed (and its own usability needs to be tested!). Having done so, it should be possible in principle to develop an interactive web-based version which can be used remotely by any designer. The model can be set up over the web and when complete it can be submitted to the Prolog analysis program. Usability alerts will then be readily available as part of every designer's toolkit.
Real contexts
The purpose of setting up an interactive, web-based site is, of course, to examine OSM's effectiveness in real use. If not accessible, it will not be used. By making it accessible, we shall be able to determine its strengths and weaknesses. At one extreme, it may transpire that nobody uses it twice; at the other, it could become a regularly-used design step.
Before trying to make it accessible in such a wide context, we intend to study its use in more conventional ways, by testing it out with student designers and by exposing it to the critical eyes of practising designers.
Coverage
There can be person-person misfits as well as user-device misfits -- e.g., misfits between the conceptual models held by different participants in a work-system. For instance, one person may need to satisfy constraints that another person is unaware of or unconcerned with. So our approach could in principle be applied to collaborative work. At present we have not explored that possibility, but future developments will, we hope, take us in that direction.
More detail about OSM and the experiment mentioned above.
Blandford, A. and Green, T. R. G. (1997) OSM: an ontology-based approach to usability evaluation. Workshop on Representations, Queen Mary College London, 1997.
Postscript version: http://www.uclic.ucl.ac.uk/annb/RepWkshp.ps
Rich Text Format (RTF) version: http://www.ndirect.co.uk/~thomas.green/workStuff/OSMs_Workshop_paper.rtf
Blandford, A. and Green, T. R. G. (1997) Design and redesign of a simultaneous representation of conceptual and device models. Submitted.
Postscript version: http://www.uclic.ucl.ac.uk/annb/OSM-DR.ps
An OSM web site
In addition, examples of OSM models for some simple objects such as analogue watches are available from this site. (The models are presented using the first version of the OSM approach rather than the redesigned version described here, but the differences are not substantial.)
Bellotti, V. (1989) Implications of current design practice for the use of HCI. In D. Jones & R. Winder (Eds.) People and Computers IV, Proceedings of HCI '89, 13-34. Cambridge University Press.
Blandford, A. E. & Young, R. M. (1996) Specifying user knowledge for the design of interactive systems. Software Engineering Journal. 11.6, 323-333.
Card, S. K., Pirolli, P. and Mackinlay, J. D. (1994) The cost-of-knowledge characteristic function: display evaluation for direct-walk dynamic information visualizations. In Adelson, B., Dumais, S. and Olson, J. (Eds.) CHI '94: Human Factors in Computing Systems. New York: ACM Press.
Cawsey, A. (1993). Explanation and Interaction: The Computer Generation of Explanatory Dialogues. Cambridge, MA: MIT Press.
Green, T. R. G. (1989) Cognitive dimensions of notations. In R. Winder and A. Sutcliffe (Eds), People and Computers V. Cambridge University Press
Green, T. R. G. (1990) The cognitive dimension of viscosity - a sticky problem for HCI. In D. Diaper and B. Shackel (Eds.) INTERACT '90. Elsevier.
Green, T. R. G. (1996) The visual vision and human cognition. Invited talk at Visual Languages '96, Boulder, Colorado. In W. Citrin and M. Burnett (Eds.) Proceedings of 1996 IEEE Symposium on Visual Languages. Los Alamitos, CA: IEEE Society Press, 1996. ISBN 0-8186-7508-X
Green, T. R. G. and Benyon, D. (1996) The skull beneath the skin: entity-relationship models of information artifacts. Int. J. Human-Computer Studies, 44(6) 801-828.
compressed (80 kB): ftp://ftp.mrc-apu.cam.ac.uk/pub/personal/tg/Skull_v2.ps.gz
uncompressed (305 kB): ftp://ftp.mrc-apu.cam.ac.uk/pub/personal/tg/Skull_v2.ps
Green, T. R. G. and Petre, M. (1996) Usability analysis of visual programming environments: a 'cognitive dimensions' framework. J. Visual Languages and Computing, 7, 131-174.
compressed (180 kB): ftp://ftp.mrc-apu.cam.ac.uk/pub/personal/thomas.green/VPEusability.ps.gz
uncompressed (2.2 MB): ftp://ftp.mrc-apu.cam.ac.uk/pub/personal/tg/VPEusability.ps
Moran, T. P. (1983) Getting into a system: external-internal task mapping analysis. Proc. CHI 83 ACM Conf. on Human Factors in Computing Systems, pp 45-49. New York: ACM.
Payne, S. J. (1993) Understanding calendar use. Human-Computer Interaction, 8, 83-100.
Whitelock, D., Green, T. R. G., Benyon, D., and Petre, M. (1994) Discourse during design: what people talk about and (maybe) why. In R. Oppermann, S. Bagnara and D. Benyon (Eds.) Proceedings of ECCE-7, Seventh European Conference on Cognitive Ergonomics. Sankt Augustin: Gesellschaft für Mathematik und Datenverarbeitung MBH. GMD-Studien Nr 233.
Yang, S., Burnett, M. M., DeKoven, E. and Zloof, M. (1998) Representation design benchmarks: a design-time aid for VPL navigable static representations. Journal of Visual Languages and Computing, 8 (5/6), 563-599.