In their paper, Greenberg and Buxton list a number of problems related to usability testing, along with situations where usability evaluation can even be harmful ”if naively done ’by rule’ rather than ’by thought’”. Their main message is:
”the choice of evaluation methodology – if any – must arise from and be appropriate for the actual problem or research question under consideration.”
The first problem they describe is the heavy push for usability evaluation in HCI practice, research and education. Usability evaluation after each iteration cycle is considered a compulsory part of the development process if user-centred design is favoured, and top conferences rarely accept papers that present novel designs without usability evaluation. Although numerous usability evaluation methods are available, laboratory-based user tests, controlled studies and usability inspections are heavily stressed in education, research and practice.
Greenberg and Buxton admit that usability evaluation is ”truly beneficial in many situations”, but note that it may give ”meaningless or trivial results”, and even hinder revolutionary innovations, in other contexts. They raise the problem that usability evaluation is mostly rather weak science, i.e.,
– research questions are easily chosen to fit the methods favoured by review committees, rather than the other way round,
– instead of truly trying to refute a new design, HCI practitioners and researchers tend to create situations favourable to the new design and thereby perform only confirmatory hypothesis testing, showing that at least one situation exists where the new design outperforms the old ones,
– replications of usability tests are seldom done and hardly ever published, although variations and nuances of the original studies should be investigated further to get a holistic view of the new design,
– quantitative empirical evaluations are overemphasised relative to other methods, and they provide a means of presenting subjective phenomena as something scientific and factual. Yet these methods leave no room for users’ arguments or intuitions about the evaluated design.
The next problem that Greenberg and Buxton bring up is that of evaluating early designs. Early design sketches that are meant to present a novel idea with a still immature technology might be compared to technologies already in use and therefore be judged by test users as too hard to use. New and parallel ideas might also be dropped too soon if the development process focuses on getting one design right instead of searching for the right design. As the last entry in this problem list, Greenberg and Buxton point out that the controlled studies that the HCI field favours give little or no room for users to assess the usefulness of the design or to come up with new ways of utilising the design in their everyday life. This cultural adoption is hard to foresee but is one of the key factors in making new innovations successful.
So, what should we do? The authors propose five initiatives for the HCI community:
1. The HCI community should recognise that usability evaluation is just one method in user-centred design, and should be used only when appropriate.
2. During a product development process, we need to consider the pros and cons of usability evaluation in different situations, and do evaluations only when they are likely to produce meaningful results.
3. The HCI community should stop demanding usability evaluations for every design to get a paper published.
4. The HCI community should start to favour rigorous science with replication studies and risky hypothesis testing.
5. The HCI community should also learn from other disciplines other ways of assessing the worth and usefulness of new designs, in addition to usability evaluation. Design studios with interactive group discussions and reflections with the design team are brought up as one example of these other methods.
Overall, the article gives much food for thought, and argues for ”a change in culture in how we do our research and practice, and in how we train our professionals”. As an academic researcher, educator and something of a practitioner, I share many of Greenberg and Buxton’s thoughts. Too often, students conduct usability evaluation by rule rather than by thought, without meaningful research problems or realistic goals. A change in attitudes and practices would be welcome, and the problems require our attention as educators and researchers.