An Adventure in Agile User Testing

I tried UserTesting.com and found it great for quickly conducting usability tests on consumer apps. The fast turnaround allowed for agile user testing and iterative design. Surprisingly, it performed on par with in-person testing, but had limitations for complex enterprise apps.

Last week I cut my usability-test-participant recruiting time from one week down to one minute. Granted, it was an easier demographic than usual – American adults with a pulse, not my usual enterprise criteria like network security administrators or sales account executives – but considering that recruiting is my least favorite part of my job, I was pretty happy with this new method.

Fast was the theme of my trial of UserTesting.com. Within one week, I ran two mini-studies and identified some key issues in the domain I was studying: online grocery delivery services.

The rapid turnaround opens up possibilities for a fundamentally different design process. The traditional 6-week 12-user 90-minute moderated usability study corresponds to a waterfall development process for enterprise software. With unmoderated testing, you can do a 3-day 6-user study with 15-minute sessions, get some key insights, iterate your design and/or iterate your test, and do a follow-up study in the same week. It’s agile user testing for agile design and development.

Testing UserTesting

To test out UserTesting.com, I did a sample study of online grocery delivery services. Are services like Amazon Fresh, Google Shopping Express, and Instacart the future of grocery shopping? What do people like about them, what do they need to improve, and what will it take for them to avoid the fate of WebVan?

To investigate, I did two mini-studies.

First, an overall concept test: What do people think of the idea of shopping for groceries online instead of in person? What concerns do they have? When would they use it? What would they buy?

Then, a comparative usability test: How does selecting products compare between Instacart and Google Shopping Express?

So what’s the verdict? Shoppers were generally open to the idea of online grocery shopping, and excited about its convenience.

The sessions captured many of the types of insights that we might find in an in-person study. In this case:

  • People like touching things in person, but the convenience of online shopping often trumps the personal touch, at least when you have time to wait for delivery. Convenience varied by demographic: the single car-free shopper would use the service for heavy and bulky items, while the mom would use it primarily for fruits and vegetables, which her kids consume ravenously

  • Concerns include pricing for small orders due to delivery fees, and the ability to buy less than the preset weights of bulk goods

  • Part of the convenience is having food delivered while you’re away, but some of the services require customers to be home for delivery

  • Seeing more products on a screen trumped large image sizes

  • Shoppers wanted the ability to compare products in multiple browser tabs, but Instacart does not allow users to right-click and open in a new tab

  • The online grocery shopping experience involves a lot of reading, and was an emotionally neutral transactional experience, not an inspirational magazine-like experience

So that was the evaluation of the online grocery services; what about the evaluation of UserTesting.com?

Great for testing consumer apps

Suffice it to say that I really liked the unmoderated approach to usability testing, at least for consumer websites and apps.

  • Fast and easy recruiting – “Someone is already doing it!” was one of the first things out of my mouth about a minute after I launched my first study. Especially for broad consumer demographics – like in our example grocery delivery study – recruiting from UserTesting’s pool of a million testers was quick and easy. We were initially skeptical about the quality of recruits, but were quite satisfied with our participants. Recruiting a user in about a minute and getting results in about half an hour is a real game-changer for getting user feedback on consumer apps. If you have a general consumer app or website, there’s no excuse not to run a quick user study as part of your design process.

  • Power of video – Video clips convey emotion, not just hard data. Emotion drives action; getting videos in front of stakeholders gives them direct experience with users that drives them to act to make the product more user-centered, not just listen to a report in a meeting and go on with their existing priorities. UserTesting makes video practical; it doesn’t take hours to edit clips like it does when you run hour-long in-person studies. Leaving out the moderator also makes the videos a bit more convincing.

  • Focused tests – the 15-minute time constraint was the biggest initial hurdle in adjusting to unmoderated testing, but I’m coming to appreciate it. Like the character limit on tweets, it forces you to edit down to the pith. Each study becomes focused on a single objective. And the responses are more focused, so the analysis is faster, and you can run a more focused meeting with stakeholders to discuss your next design iteration.

Surprisingly on par with in-person testing

I expected that unmoderated remote usability testing would be limited to narrow studies that identify usability barriers in pre-defined tasks, but I was actually able to replicate most of our in-person usability activities through UserTesting.

  • Qualitative questions – I’m big on understanding the context of use by starting usability sessions with a semi-structured interview where I collect stories about what the user does in real life. I thought it would be hard to replicate those types of conversations in an automated tool, but I was quite satisfied with the insights we got from participants’ verbal answers. You can’t personalize and probe as much as in a semi-structured interview, but the structure makes it easier to compare results, and because participants knew there would be no back-and-forth, they tended to provide more complete answers upfront.

  • Probing on design elements – Often clients are interested in getting feedback on a specific design element, for example a filtering tool on an e-commerce website. Not everyone uses the element of interest, and even if they do, they don’t necessarily comment on what they like or don’t like about it. In person, we handle this by prompting the user to try it out and give their feedback. I thought these conditional prompts would be impossible with an unmoderated study, but in fact I learned that you can accomplish it by adding tasks like: “If you found the shirt without the filter options, use this link and try the filter option.” There’s a limit to this – for example, you can’t ask participants what’s confusing to them the way you could in person when you see them furrowing their eyebrows – but you can probe on the main design elements that you want to evaluate.

Limited for testing complex enterprise apps

Overall I found UserTesting very well suited for evaluating most consumer apps and websites, but in its current state, I don’t expect that it would be useful for most enterprise products.

  • Not a magic bullet for recruiting specialized demographics – UserTesting has a general consumer panel, and you can ask panelists to self-select for common consumer roles like “parent” or “soccer player,” but I wouldn’t expect to get instant recruits for criteria like nurses, network administrators, or users of an obscure product. They have relationships with recruiters and you can give them your own list of participants, but this only works for desktop websites, not mobile apps, and it isn’t much different from going directly to a recruiter or panel provider for your own in-person study.

  • Not a magic bullet for benchmarking – I expected that one of the main applications of unmoderated testing would be to automate the most mechanical form of usability testing, the summative benchmark test. I expected the tool would automatically count errors, time-on-task, and task success, so I could just up the number of participants to increase the statistical power of the benchmark (see the sketch after this list). However, UserTesting does not collect those statistics, and it cannot give assists, so it doesn’t radically change the time it takes to do a benchmark study.

  • May not want to leave it to chance with hard-to-find recruits – If you have a hard-to-recruit, hard-to-schedule demographic like doctors, and if you’re able to schedule an hour with 8-12 of them, for their sake and yours, you might not want to leave it to chance that they will interpret your unmoderated tasks correctly, that the prototype will work, that they’ll move through all the tasks quickly enough instead of getting caught on their pet issue, and that they’ll keep paying attention through the end of the session. An in-person moderator can improvise when things go wrong, give assists when needed, and give the participant the human attention that shows we care about their feedback. Like a phone tree versus a human customer service agent, the phone tree works, until it doesn’t.
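
As an aside on what “statistical power” means for a benchmark: below is a minimal Python sketch – my own illustration, not anything UserTesting reports – of the adjusted-Wald (Agresti-Coull) confidence interval commonly recommended for small-sample task-success rates. The 5-of-6 result is hypothetical.

```python
import math

def adjusted_wald_ci(successes: int, n: int, z: float = 1.96):
    """95% adjusted-Wald (Agresti-Coull) CI for a task success rate."""
    n_adj = n + z ** 2                        # add z^2 "virtual" trials...
    p_adj = (successes + z ** 2 / 2) / n_adj  # ...and z^2/2 "virtual" successes
    margin = z * math.sqrt(p_adj * (1 - p_adj) / n_adj)
    return max(0.0, p_adj - margin), min(1.0, p_adj + margin)

# Hypothetical benchmark: 5 of 6 participants completed the task.
low, high = adjusted_wald_ci(5, 6)
print(f"Success rate 5/6, 95% CI: {low:.0%} to {high:.0%}")  # ~42% to ~99%
```

With only six users the interval spans roughly 42% to 99%, which is exactly why a summative benchmark needs far more participants than a quick formative study, whether or not the tool counts the numbers for you.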

Agency value-add

So why are we, a company that does full-service usability testing, discussing a cheap and easy usability testing service?

Just as with SurveyMonkey, which made it easy for anyone to deploy their own surveys and even draw from a question bank: sometimes it makes sense to do it yourself, and sometimes you need an expert to plan the questions and analyze the results, to go from data to insights, and to integrate it all with your product development process to help you accelerate toward product-market fit.

I’m looking forward to adding unmoderated user testing to our user experience toolset.