This talk was a shortened version of a talk given at Visual Languages 96. Some of the illustrations I used in that talk are available from my home page. From there you can also download a copy of a much more detailed paper by Green and Petre, which appeared in the Journal of Visual Languages and Computing 1996.
The problem I address is, how to evaluate the usability of information-based artefacts and notations. More particularly, how to evaluate them cheaply. No laborious user-testing, no detailed analyses and modelling by HCI experts.
HCI folk have slowly learnt that expensive, time-consuming evaluative methodologies are not taken up by creators, for very good reasons. I propose something different, a framework of user-centered discussion tools.
We all have concepts that are vaguely known but unformulated. Discussion tools are elucidations of such concepts. If they resonate with your experience, they can promote a higher level of discourse amongst you, the designers and creators. They can create goals and aspirations, promote the reuse of good ideas in new contexts, and provide a basis for informed critique. Standard examples can become common currency and best of all, once concepts are named and exposed, their interrelationships can be appreciated.
The set I propose is called the cognitive dimensions framework, a still-unfinalised set of about a dozen terms such as 'viscosity', 'premature commitment', 'abstraction level'.
The problem of usability evaluation has been attacked in three ways. One way is to perform user testing, in which users are watched while they use the system. This is expensive and somewhat artificial: users in a laboratory, performing specified tasks, are different from users in the wild. Moreover, it takes far too long. Designers cannot hang around for the necessary weeks, then make a small modification and repeat the cycle. Results from lab testing are valuable but in general they are too expensive and not 100% trustworthy.
The second way is to use predictive user models. This is much cheaper and has been extremely successful in some cases. Much the commonest technique is GOMS, in which all the users' core tasks are scrutinised in detail, possible methods are worked out, and the time required for the user actions to accomplish these methods is predicted in detail. Although GOMS has achieved considerable success as a cheaper and quicker alternative to user testing, it is still an expensive undertaking, and it requires the assistance of HCI experts. Moreover, it really needs to be performed on the finished system, since the time-predictions it deals in depend on physical and perceptual characteristics of the interface display. (See Olson and Olson, 1990, for one of many accounts of GOMS).
I would certainly recommend designers of information retrieval systems to use GOMS rather than to perform user testing. But my purpose here is to describe my cognitive dimensions framework, one of a new generation of lightweight, approximate evaluation methods which constitute the third type of attack on the problem of usability evaluation.
The cognitive dimensions framework is not an analytic method. Rather, it is a set of discussion tools. My purpose is to provide a way in which some evaluation can be done by the designers themselves.
I believe what we need is to improve the quality of discussion. Experts make sophisticated judgements about systems, but they have difficulty talking about their judgements because they don't have a shared set of terms. Also, experts tend to make narrow judgements, based on their own needs of the moment and their guesses about what other people may need; and other experts don't always point out the omissions. Again, if they had a shared set of terms, and that set was fairly complete, it would prompt a more complete consideration.
In short, experts would be in a good position to make useful early judgements, if (i) they had better terms with which to think about the issues and discuss them, and (ii) there was some kind of checklist. The terms might or might not describe a new idea; most probably, the expert will recognise a concept as something that had been in his or her mind, but had never before been clearly articulated and named.
Discussion tools are good concepts, not too detailed and not too woolly, that capture enough important aspects of something to make it much easier to talk about that thing. They promote discussion and informed evaluation.
To be effective, they must be shared - you and I must have the same vocabulary if we are going to talk. And it is better still if we share some standard examples. And it is best of all if we know some of the pros and cons - the trade-offs between one thing and another.
They have many advantages, as noted above.
What discussion tools do not need to do is describe novel ideas. If they just give a name to something you had often thought about but had never got round to naming, that's fine. But sometimes they might be ideas that are new to you, and that's fine, too.
Figure 1 illustrates a real-life discussion without the benefit of discussion tools; Figure 2 shows how it might have been if the participants had possessed shared concepts - shorter, more accurate, and less frustrating.
The previous section illustrated the idea of discussion tools; the 'cognitive dimensions' framework, first introduced in Green (1989), is meant to provide discussion tools to help people who are not HCI experts in making quick but useful evaluations.
I believe that taken together, the cognitive dimensions describe enough aspects to give a fair idea of how users will get on with a system, and can help both designers and users think and talk about the system. Each dimension describes one aspect of a system, something that affects how users will manage. The framework contains about 12 dimensions, but in this document I shall not go into detail.
Something that is cognitively hard in one environment may be much easier in another. As an extreme example, writing a Pascal program over the phone is not recommended, even though Pascal has an easy syntax for writing on paper. The properties also depend on the tools available in a given environment; a word processor with no search-and-replace is a different system to one that has a powerful and easy-to-use tool.
You can fix any kind of difficulty, either by changing the notation or by changing the environment, but you usually pay for it with another kind. For example the search-and-replace tool fixes one form of 'viscosity' but it introduces a new and slightly higher level of abstract thought. Search tools that use regular expressions are an even better fix - but they introduce a very much higher level of abstract thought.
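As a small sketch of this trade-off (the text and tool choices here are my own illustration, not from the talk): a plain search-and-replace asks the user only for a literal string, while a regular expression fixes more cases in one pass but demands reasoning about patterns rather than concrete text.

```python
import re

text = "The colour red. A colour wheel. Colourful."

# Plain search-and-replace: a low level of abstraction (a literal
# string), but it misses the capitalised "Colourful".
fixed = text.replace("colour", "color")
# -> "The color red. A color wheel. Colourful."

# A regular expression fixes every case in one pass, at the cost of
# a much higher level of abstract thought: the user must now reason
# about character classes and replacement functions, not just text.
fixed_re = re.sub(r"[Cc]olour", lambda m: m.group(0)[0] + "olor", text)
# -> "The color red. A color wheel. Colorful."
```

The more abstract tool removes more repetition viscosity, but each step up the ladder excludes some users: exactly the trade-off the dimensions are meant to make discussable.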
I shall illustrate some of these trade-off relationships below.
A global spelling corrector might seem a good idea, but is it always? Sometimes you want to make sure the user looks at every single case separately.
In all the examples I shall give, you should remember that different circumstances might demand different designs.
A notation might be as good as you like, considered just as a static entity, but what needs to be evaluated is the whole process of using it: building or writing in it, debugging it, reading it, maintaining it over the years. Certain sorts of diagrammatic notations, such as diagrammatic query languages, are probably easier to understand than symbolic notations, but they are also harder to 'write' and harder to modify. These various aspects all need to be balanced out.
In its present form, the framework has 14 dimensions (Figure 3) although if I'm honest I think there are overlaps.
I obviously can't go through all of those here. On the other hand, to describe the trade-offs issue I have to at least give thumbnail sketches of three or four.
A viscous system resists change - you have to do a lot of work. For example, if you have produced a long and carefully formatted document and then someone tells you to change the style of all the level 2 headings, say a hundred of them, correcting each one individually is hard work ('repetition viscosity').
One solution is to create a 'style sheet' that defines a level 2 heading. By changing the definition, you can change all the headings.
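A minimal sketch of the style-sheet idea (the data layout here is hypothetical, chosen only to make the point concrete): each heading stores a style *name*, and the formatting lives in one shared definition, so one change reformats everything.

```python
# Hypothetical sketch: headings carry only a style name; the
# formatting itself lives in a single shared definition.
styles = {"heading2": {"font": "Times", "size": 14, "bold": True}}

headings = [{"text": f"Section {i}", "style": "heading2"} for i in range(100)]

def render(h):
    s = styles[h["style"]]      # formatting is looked up at render time
    return (h["text"], s["size"])

# Repetition viscosity: without styles we would edit 100 headings
# by hand. With the style sheet, one edit updates them all.
styles["heading2"]["size"] = 12

assert all(render(h)[1] == 12 for h in headings)
```

The cost of the fix is a new abstraction ('style sheet') that the user must learn, which is the trade-off discussed below.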
In Green (1990) I distinguish repetition viscosity and 'knock-on' viscosity.
Change one thing, and who knows what might fall over? That's the hidden dependency problem. Spreadsheets are a fine example. So are some kinds of style sheets; if one style is defined in terms of another, changing the parent might give you an unpleasant surprise.
The key aspect is not the fact that A depends on B, but that the dependency is not made visible.
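Continuing the hypothetical style-sheet sketch, the surprise can be shown in a few lines: the child style inherits from its parent, but nothing at the point of use reveals that link.

```python
# Hypothetical sketch of a hidden dependency: 'quote' is defined in
# terms of its parent style, but that link is not made visible.
styles = {
    "body":  {"size": 12},
    "quote": {"parent": "body", "indent": 2},   # inherits size from body
}

def size_of(name):
    s = styles[name]
    if "size" in s:
        return s["size"]
    return size_of(s["parent"])   # the invisible dependency

assert size_of("quote") == 12

# Changing the parent silently changes the child - the unpleasant
# surprise, because the dependency was never displayed.
styles["body"]["size"] = 10
assert size_of("quote") == 10
```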
I like to think of this as the number of new high-level concepts that have to be learnt to make use of a system, such as 'style sheet' or 'regular expression'. Each new idea is a significant barrier to learning and acceptance. High-level ideas - that is, ideas that do not refer to easily-produced concrete instances - are very much harder still.
Let's continue with the word-processor example. If you decide to use styles, notice that you have to decide what styles you want and how they are related very early. Too soon, sometimes. Afterwards, you might wish that you had defined something like "inset block quotation [no space afterwards]" as a child of 'inset' rather than as a child of 'quotation' - but it's probably too late now; the viscosity of the system would make it too much work to redefine everything.
Premature commitment occurs in all sorts of places. Try drawing a map to guide someone to your house. What's the betting you start too close to one side of the paper, or start at the wrong scale, and the last few turnings are all scrunched up?
Or try working out a few formulae with a pocket calculator. How often do you get in a knot because you've started entering the formula in a way that makes the computation extra hard?
In the examples I have given, I have repeatedly illustrated how fixing a problem in one dimension leads to a problem with another dimension. A sort of law of conservation of cussedness.
However, the designer can choose. He or she can fix the viscosity problem by increasing the abstraction level OR by changing to a different kind of notation. Using more abstractions is the commonest solution, but by no means the only one.
I like to compare the cussedness of information structures with the behaviour of ideal gases. Three quantities, temperature, pressure and volume, describe an ideal gas. If you want to increase the temperature, you can keep the pressure constant (but the volume must be allowed to increase) or you can keep the volume constant (but the pressure must be allowed to increase). Taken in pairs, these three dimensions are orthogonal. But you cannot raise the temperature while holding constant both the pressure and the volume.
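In symbols, the constraint the analogy rests on is the ideal gas law:

```latex
PV = nRT
```

With the amount of gas $n$ and the constant $R$ fixed, any two of pressure $P$, volume $V$ and temperature $T$ may be chosen freely, but the third is then determined; you cannot hold all three constant while changing one.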
The parallels may not be exact, but they are intriguing.
Figure 4 illustrates some of the trade-off relationships that are frequently observed. We have seen how viscosity can be reduced by introducing more abstractions; but getting the abstractions right demands thinking ahead (i.e. there is a premature commitment problem). Viscosity increases the cost of premature commitment; if the abstractions themselves are viscous, then getting them wrong means you're in trouble. Furthermore, all too often abstractions introduce problems of hidden dependencies, because one abstraction is defined in terms of another.
Secondary notation and visibility were not discussed above.
My approach is cheap and easy to apply: it needs no laboratory, no detailed task analysis, and no HCI experts.
What you get out of this approach is a rough and sketchy evaluation. As we saw back in Figure 1, it will correspond to what users talk about. And if you were to consider changing the design, it will alert you to some of the possible trade-off consequences.
What it will not do is give precise time estimates. For that you should use GOMS (see above).
Nor does the question of users' knowledge get much attention in my framework. A much more thorough approach to users' knowledge has been developed by Lewis et al. (1991). In their 'cognitive walkthrough' methodology, the emphasis is on how the user knows what to do next.
For best results, I think all three methods could be employed, since they address different facets of the problem.
In practice, the cognitive dimensions approach seems to have hit the right note for many people. It has been tried as a teaching tool and as a simple evaluation method, in both cases with success.
In the field of information retrieval not much has been done with it, but I think it would be a good way to make a preliminary evaluation of the usability of a system, rather than going straight to expensive user testing.