Jan
27

The Weekly Echo 1/27

posted by: Kimra McPherson

We’ve got a nickname-filled office to begin with, but things really picked up this week when we spotted this chart of blues names making its way around the internet. The EchoUser Experience will now be known as Texas Chicken Green.

A smattering of EchoUser office reactions to this plush desk nap pod:

“How do you breathe?”

“It looks cozy … until you suffocate.”

“It’s an ostrich-bag-thingy.”

“It’s a fuzzy deep-sea-diving helmet.”

This color-matching game is just as addictive as (if not more so than) the kerning game from the same folks that we loved a while back, but it feels much more stressful!

Fine, fine, it’s not all fun and games around here. We learned a lot this week from this Uday Gajendar presentation on how to partner with a UI designer.





Jan
26

Calculating Usability

posted by: Sally Tang

The EchoUser research team had quite a busy December. Our schedules were filled with recruiting users, drafting test plans, moderating usability sessions, writing reports, and, last but not least, arranging check-in meetings with clients throughout the project cycle.

Clients — regardless of their UX background — would raise questions and concerns about UX methodology in those meetings to make sure that their studies were on the right track and that they would get valuable and defensible data from the projects.

In the two usability projects I am on (both benchmark studies), I came across the following two interesting questions from our clients. Though the two questions seemed to have come from two different angles, they both point to one of the key issues in doing usability studies: how to interpret usability data with a small number of users. I thought I’d share the two client questions and hope to elicit some extended discussions here.

Client Question 1: How many participants is enough for a benchmark usability study? Eight, 10, or 12?

A lot of times, the question actually becomes, “Do we need a single-digit participant number or a double-digit one?” Clients want the usability study results to be defensible from both a statistical and a PR standpoint. When time and resources allow and it’s easy to recruit target participants, the question of “Should we get two more participants for the study?” has an easy solution: Let’s just do two more sessions. However, when qualified participants are very difficult to find or recruit (for instance, the study requires a highly specific user profile) or time and resources are limited, how many participants are needed? Is it worthwhile to spend two more weeks on the study just to reach a total of 10 participants?

The bigger issue: What is the rationale we should use to validate the number of participants for a usability study?

If we go back to the classic model from Nielsen, five users are enough to uncover 85% of usability issues. That has been the UX industry standard for a long time, as Jakob Nielsen and his colleagues were among the first UX professionals to calculate the relationship between the number of UX issues uncovered and the number of participants involved. The mathematical model is derived from their years of experience conducting usability studies.

Faulkner challenged Nielsen’s model in 2003 with a paper titled “Beyond the five-user assumption: Benefits of increased sample sizes in usability testing.” She carefully designed and conducted a few studies with different sample sizes (5, 10, 20, 30, 40, 50, and 60 participants). What she learned from the follow-up data simulation and analysis is that 10 participants are enough to identify at least 82% of the usability issues, whereas a sample size of 15 can help to identify at least 90% of the issues.

I even came across a sample size calculator on Jeff Sauro’s Measuring Usability site. Based on the binomial probability formula, it allows you to calculate, for instance, how many users are needed to discover 80% of the usability issues when each issue has a probability of occurrence of at least 30%.
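To make the arithmetic behind these numbers concrete, here is a minimal sketch of the binomial problem-discovery formula, 1 − (1 − p)^n, where p is the chance that any one participant runs into a given issue and n is the number of participants. The function names are made up for illustration, and the p = 0.31 used in the example is an assumed average occurrence rate (the figure usually associated with the five-user rule), not a value taken from the client studies described here. This is not the code behind Jeff Sauro’s calculator, just the same kind of calculation.

```python
import math


def proportion_found(p: float, n: int) -> float:
    """Expected share of issues seen at least once by n participants,
    assuming each issue independently affects a participant with probability p."""
    if not 0 < p < 1:
        raise ValueError("p must be between 0 and 1 (exclusive)")
    return 1 - (1 - p) ** n


def users_needed(p: float, target: float) -> int:
    """Smallest n such that 1 - (1 - p)**n reaches the target discovery rate."""
    if not 0 < p < 1 or not 0 < target < 1:
        raise ValueError("p and target must be between 0 and 1 (exclusive)")
    return math.ceil(math.log(1 - target) / math.log(1 - p))


if __name__ == "__main__":
    # Assumed average occurrence rate of 31% and five participants.
    print(f"5 users, p=0.31: {proportion_found(0.31, 5):.1%} of issues")
    # The example from the text: 80% discovery when issues occur 30% of the time.
    print(f"Users needed for 80% at p=0.30: {users_needed(0.30, 0.80)}")
```

Under these assumptions, five users find about 84% of the issues, and five users are also what it takes to reach 80% discovery when each issue occurs 30% of the time. The formula is very sensitive to p: rarer issues need far more participants, which is one way to read Faulkner’s results.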

All of the above can be used as reference rationales to validate using a certain number of participants for a study. However, as specifically mentioned in Faulkner’s paper, having a highly representative user sample is crucial in uncovering the priority usability issues. Indeed, beyond all those statistical models, getting the right users is sometimes as important as (if not more important than) getting enough users.

Client Question 2: Are we telling the product team that 80% of our customers will fail to use this functionality because 8 out of 10 users failed in the usability study?

Well, the primary purpose of usability studies is to discover qualitative usability issues with an interface, as opposed to predicting the probability of those issues’ occurrence. However, the task completion rate is one of the key metrics we use to evaluate the usability of different UI features, and it is our responsibility to give clients and the product team a clear idea of how to interpret the completion rate.

The confidence level of the results is, again, closely related to the number of users included in the study. From a statistical standpoint, it’s not difficult to understand that the more users in the study, the more confident we can be in the results. However, with only 10 participants, how confident can we say we are in our results?

John Sorflaten has an interesting article discussing this topic. He points out the limitations of using task success data to predict customer behavior on a larger scale, and he recommends using the Adjusted Wald Interval calculator coded by Jeff Sauro to generate the lower and upper bounds of the task success data.

For instance, if 8 out of 10 participants succeed in a task, how could this data be used to predict 1,000 or 10,000 users’ behavior? By using a confidence level of 95% (if you ran the same study 100 times, about 95 of the resulting ranges would contain the true success rate), Jeff’s calculator generates a lower bound of 48% success and an upper bound of 96% success based on the 80% task success rate from the usability study and accounting for the small sample size. And the same is true if 8 out of 10 participants fail in a task: The calculator predicts that as few as 48% or as many as 96% of users will fail the task when the UI is actually released and on the market.
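For anyone who wants to check the bounds by hand, below is a small sketch of the standard Adjusted Wald calculation: add z²/2 to the number of successes and z² to the number of participants, then build an ordinary Wald interval around the adjusted proportion. The function name is hypothetical and this is not Jeff Sauro’s code; his calculator may round or handle edge cases differently.

```python
import math
from statistics import NormalDist


def adjusted_wald(successes: int, n: int, confidence: float = 0.95) -> tuple[float, float]:
    """Adjusted Wald confidence interval for a completion rate.

    Adds z^2/2 successes and z^2 trials before computing a Wald interval,
    which behaves much better than the plain Wald interval for small samples.
    """
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n_adj = n + z ** 2
    p_adj = (successes + z ** 2 / 2) / n_adj
    margin = z * math.sqrt(p_adj * (1 - p_adj) / n_adj)
    # Clamp to the valid range for a proportion.
    return max(0.0, p_adj - margin), min(1.0, p_adj + margin)


if __name__ == "__main__":
    low, high = adjusted_wald(8, 10)
    print(f"8 of 10 succeed: {low:.0%} to {high:.0%}")
    low, high = adjusted_wald(2, 10)
    print(f"2 of 10 succeed: {low:.0%} to {high:.0%}")
```

With 8 successes out of 10, this version gives roughly 48% to 95%, which matches the lower bound quoted above and lands within about a point of the upper bound. The interval for 2 successes out of 10 is its mirror image, which is why the same range applies whether 8 participants succeed or 8 fail.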

In that sense, as opposed to using the 80% task success rate to predict broader user behavior, we as usability professionals can show the range between 48% and 96% as a reference range for the product manager or marketing team to make further interpretations or decisions.

Next time, when clients are debating between 8 and 10 participants, or the product manager is asking why the task completion rate does not match large-scale user data, these basic stats will help to answer the questions.





Jan
24

Enterprise Software and Nielsen’s Heuristics

posted by: Noah Kersey

A little while ago we were asked:

“Jakob’s famous heuristics” refers to the 10 rules of thumb that Jakob Nielsen has developed and promoted, which commonly form the basis for the discount usability method of heuristic evaluation.

In my view, the short answer is that the guidelines can be very helpful, and each of them has something to offer when building or evaluating enterprise applications. BUT, and this is a big but, in the domain of enterprise software it can be harder to adhere to them than in a consumer setting. I was talking with my colleagues about this, and @kimretta nicely crystallized what I was trying to express: in the enterprise setting, instead of designing only for an interaction between a system and a user, you are designing for an interaction among a user, a system, and a business.

All too often the system and the business get together first, and the end user doesn’t get much time or attention. Currently, and I will overgeneralize a bit here, the development and deployment process of much enterprise software doesn’t include a person focused on and empowered to champion user-centered design. The developer is not the user, the purchaser is not the user, and the people who implement and support the IT infrastructure are not the user. Software gets developed to meet business needs, not user needs. And perhaps I’m revealing my social science perspective here, but I think it is critical to realize that in some cases, user-centered design changes in enterprise software require organizational change in the enterprise, something that is difficult in the best of times.

What we see in our practice, however, is that neglected user needs in the enterprise come back to haunt the business through lost productivity and reduced worker satisfaction.

What’s up with Enterprise software?

It is easy to find critiques of enterprise software (I enjoyed this entry on the topic). Common refrains for why enterprise software typically offers a substandard user experience include: the disconnect between those who purchase and those who use, legacy lock-in, a ‘more-is-better’ mind-set about adding features, and the way enterprise software is first and foremost aligned to business rules, at times with a seemingly blatant disregard for the humans who need to interact with it.

For each heuristic that Jakob calls out, I could tell a story about a piece of software where I’ve personally seen the negative effects on users caused by disregarding that heuristic. Sometimes the negative effects are small, mitigated by ‘software calluses’ users develop as they learn the idiosyncrasies of a particular system. But sometimes the effects are large, resulting in users looking for any way possible to avoid using the software, reducing productivity, efficiency and morale. Some of the 10 may not apply as much as others in particular contexts, and we definitely see niche- and expert-user audiences able to adapt to systems that have less polish and fewer affordances for new users in spite of things we might otherwise consider problematic.

For example, take a heuristic like “Match between system and the real world,” which I frequently see enterprise software struggle with. Remember the three parts I mentioned earlier: system, business and user? Which real world are we now to design for? Inevitably, employees’ understanding of the ‘real world’ of the business (perhaps incomplete, based on role) is different from their ‘real world’ outside of work (which may be influencing their expectations about how interfaces should behave). Also, the ‘real world’ of each business unit may be different in non-trivial ways, even for people using the same software at the same company, not to mention between companies. Building and customizing software in this environment is a different challenge from crafting a more singular consumer offering.

Browser UX

One trend that is relevant here is the move to browser-based interfaces. This is not new, of course, but over the last decade or two the software frame that one uses to access enterprise tools has increasingly become the same one used for personal activities. As a result, I have seen expectations about enterprise tools converge with expectations about higher-quality consumer-level offerings. If someone sees a particularly helpful feature while using other sites on the web, they will ask about their enterprise app: “Why can’t my work program do that?”

Enterprise = just plain tough?

The current state of enterprise user experience seems to be improving. However, there is a long way to go, and Jakob’s heuristics continue to be relegated to the “nice-to-have” category, to end users’ detriment.

This is not to say that Jakob’s heuristics are a universal or final answer, because they are not. As enterprises pursue native app development on mobile devices, tablets, or our Minority Report platforms of the future, there will always be the need to look at a specific context to see what makes sense. But just because integrating general, non-enterprise-specific heuristics is hard in the enterprise environment doesn’t mean they should be disregarded. Making enterprise software more usable by consumer standards will create benefits in the longer term, with more efficient workplaces and less time spent accommodating systems that offer a poor fit to the work that needs to get done.


