
Measuring the User Experience

Collecting, Analyzing, and Presenting Usability Metrics

Chapter 3
Planning a Usability Study

Tom Tullis and Bill Albert


Morgan Kaufmann, 2008
ISBN 978-0123735584

Introduction

• What are the goals of your usability study?
  - Are you trying to ensure optimal usability for a new piece of functionality?
  - Are you benchmarking the user experience for an existing product?

• What are the goals of the users?
  - Do users complete a task and then stop using the product?
  - Do users use the product numerous times on a daily basis?

• What is the appropriate evaluation method?
  - How many participants are needed to get reliable feedback?
  - How will collecting metrics impact the timeline and budget?
  - How will the data be collected and analyzed?
Study Goals

• How will the data be used within the product development lifecycle?

• Two general ways to use data
  - Formative: like a chef who periodically checks a dish while it's being prepared and makes adjustments to positively impact the end result
  - Summative: like a restaurant critic who evaluates the dish after it is completed and compares the meal with other restaurants

Study Goals

• Formative Usability
  - Evaluates a product or design, identifies shortcomings, makes recommendations, and repeats the process
  - Attributes
    - Iterative testing with the goal of improving the design
    - Done before the design has been finalized
  - Key Questions
    - What are the most significant usability issues preventing users from completing their goals or resulting in inefficiencies?
    - What aspects of the product work well for users? What do they find frustrating?
    - What are the most common errors or mistakes users are making?
    - Are improvements being made from one design iteration to the next?
    - What usability issues can you expect to remain after the product is launched?

Study Goals

• Summative Usability
  - Goal is to evaluate how well a product or piece of functionality meets its objectives
  - Often involves comparing several products to each other
  - Focus on evaluating against a certain set of criteria
  - Key Questions
    - Did we meet the usability goals of the project?
    - How does our product compare against the competition?
    - Have we made improvements from one product release to the next?

User Goals

• Need to know about the users and what they are trying to accomplish
  - Are they forced to use the product every day as part of their jobs?
  - Are they likely to use the product only once or twice?
  - Is the product a source of entertainment?
  - Do users care about design aesthetics?

• Simplifies to two main aspects of the user experience
  - Performance
  - Satisfaction

User Goals

• Performance
  - What the user actually does in interacting with the product
  - Metrics (more in Ch 4)
    - Degree of success in accomplishing a task or set of tasks
    - Time to perform each task
    - Amount of effort to perform a task (number of mouse clicks, cognitive effort)
  - Especially important for products whose users have no choice in how they are used: if users can't successfully complete key tasks, the product will fail
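Most performance metrics above are simple counts and rates. As a sketch, here is a hypothetical helper (the function name and the use of the normal approximation are assumptions, not from the slides) for reporting a binary task-success rate with a confidence interval; note the normal approximation is rough at typical usability sample sizes:

```python
import math

def success_rate_ci(successes, trials, z=1.96):
    # Task success coded binary: 1 = success, 0 = failure.
    # Returns (rate, lower, upper) for a normal-approximation
    # confidence interval, clamped to [0, 1]. Illustrative helper.
    p = successes / trials
    half = z * math.sqrt(p * (1 - p) / trials)
    return p, max(0.0, p - half), min(1.0, p + half)

# Example: 8 of 10 participants completed the task
rate, lo, hi = success_rate_ci(8, 10)
```

Reporting the interval alongside the rate makes clear how little certainty ten participants actually buy.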

User Goals

• Satisfaction
  - What users say or think about their interaction
  - Metrics (more in Ch 6)
    - Ease of use
    - Exceeding expectations
    - Visual appeal
    - Trustworthiness
  - Especially important for products whose users have a choice in usage

Choosing the Right Metrics
Ten Types of Usability Studies

• Every usability study has unique qualities; ten common scenarios are presented, with recommended metrics for each

Choosing the Right Metrics
Ten Types of Usability Studies

Metric types drawn on throughout: Task Success, Task Time, Errors, Efficiency, Learnability, Issue-Based Metrics (Ch 5), Self-Reported Metrics (Ch 6), Behavioral and Physiological Metrics (Ch 7), Combined and Comparative Metrics (Ch 8), Live Website Metrics (Ch 9), Card Sorting Data (Ch 9).

Issue-Based Metrics (Ch 5)
• Anything that prevents task completion
• Anything that takes someone off course
• Anything that creates some level of confusion
• Anything that produces an error
• Not seeing something that should be noticed
• Assuming something is correct when it is not
• Assuming a task is complete when it is not
• Performing the wrong action
• Misinterpreting some piece of content
• Not understanding the navigation

Choosing the Right Metrics
Ten Types of Usability Studies

Self-Reported Metrics (Ch 6)
• Asking participants for information about their perception of the system and their interaction with it
  - Overall interaction
  - Ease of use
  - Effectiveness of navigation
  - Awareness of certain features
  - Clarity of terminology
  - Visual appeal
• Common instruments
  - Likert scales
  - Semantic differential scales
  - After-scenario questionnaire
  - Expectation measures
  - Usability Magnitude Estimation
  - SUS (System Usability Scale)
  - CSUQ (Computer System Usability Questionnaire)
  - QUIS (Questionnaire for User Interface Satisfaction)
  - WAMMI (Website Analysis & Measurement Inventory)
  - Product Reaction Cards
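Of the instruments listed, SUS has the simplest well-known scoring rule: each of the ten 1-5 ratings is converted to a 0-4 contribution and the sum is scaled to 0-100. A minimal scorer:

```python
def sus_score(responses):
    # responses: ten ratings on a 1-5 scale, in questionnaire order.
    # Odd-numbered items are positively worded and contribute
    # (rating - 1); even-numbered items are negatively worded and
    # contribute (5 - rating). The sum is scaled by 2.5 to 0-100.
    if len(responses) != 10:
        raise ValueError("SUS has exactly 10 items")
    total = sum((r - 1) if i % 2 == 0 else (5 - r)
                for i, r in enumerate(responses))
    return total * 2.5

# All-neutral ratings (3 everywhere) score 50.0
```

The resulting 0-100 number is not a percentage; it is only meaningful relative to other SUS scores.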

Choosing the Right Metrics
Ten Types of Usability Studies

Behavioral and Physiological Metrics (Ch 7)
• Verbal Behaviors
  - Strongly positive comment
  - Strongly negative comment
  - Suggestion for improvement
  - Question
  - Variation from expectation
  - Stated confusion/frustration
• Nonverbal Behaviors
  - Frowning/grimacing/unhappy
  - Smiling/laughing/happy
  - Surprised/unexpected
  - Furrowed brow/concentration
  - Evidence of impatience
  - Leaning in close to the screen
  - Fidgeting in chair
  - Rubbing head/eyes/neck

Choosing the Right Metrics
Ten Types of Usability Studies

Combined and Comparative Metrics (Ch 8)
• Taking smaller pieces of raw data, such as task completion rates, time-on-task, and self-reported ease of use, to derive new metrics such as an overall usability metric or a usability scorecard
• Comparing existing usability data to expert or ideal results
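One way to combine raw metrics into a single score is to standardize each metric across participants and average the standardized values; this is a sketch of that approach only (the book also discusses other combining methods, and the function names here are assumptions):

```python
import statistics

def zscores(values, invert=False):
    # Standardize one metric across participants; invert when lower
    # raw values are better (e.g., time-on-task).
    mean, sd = statistics.mean(values), statistics.stdev(values)
    zs = [(v - mean) / sd for v in values]
    return [-z for z in zs] if invert else zs

def overall_usability(success, time, rating):
    # Per-participant overall score: the mean of the participant's
    # standardized success rate, (inverted) task time, and rating.
    cols = [zscores(success), zscores(time, invert=True),
            zscores(rating)]
    return [statistics.mean(triple) for triple in zip(*cols)]
```

Because each column of z-scores averages to zero, the overall scores are centered on zero: positive means better than the group average.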

Choosing the Right Metrics
Ten Types of Usability Studies

Live Website Metrics (Ch 9)
• Information you can glean from live data on a production website
  - Server logs: page views and visits
  - Click-through rates: number of times a link is shown vs. actually clicked
  - Drop-off rates: abandoned processes
  - A/B studies: manipulate the pages users see and compare metrics between them

Card Sorting Data (Ch 9)
• Open card sort: give participants cards; they sort them and define the groups
• Closed card sort: give participants cards and the names of groups; they put the cards into the groups
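The click-through and drop-off calculations above are straightforward ratios; a minimal sketch (function names are illustrative):

```python
def click_through_rate(clicks, impressions):
    # Number of times the link was clicked vs. times it was shown.
    return clicks / impressions

def drop_off_rates(step_counts):
    # Share of users lost between consecutive steps of a process,
    # given the count of users reaching each step in order.
    return [1 - later / earlier
            for earlier, later in zip(step_counts, step_counts[1:])]

# Example: 1000 users start checkout, 700 enter shipping, 630 pay
```

A spike in one step's drop-off rate points directly at the step worth redesigning.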

Choosing the Right Metrics
Ten Types of Usability Studies

• Completing a Transaction
  - Goal: make the transaction as smooth as possible
  - Well-defined beginning and end (completing a purchase, registering new software, selling stocks)

• Metrics
  - Task Success
    - Scored as success or failure; need a clear end-state to confirm task success
    - Reporting the percentage of success is a good measure of the effectiveness of the transaction
  - Efficiency
    - When users must complete the same transaction many times, it is good to measure task completion per unit of time
  - Issue-Based Metrics
    - Assign a severity to each usability issue to identify and focus on high-priority problems
  - Self-Reported Metrics
    - Likelihood to return; user expectations
  - Live Website Metrics (if the transaction involves a website)
    - Drop-off rate may indicate problematic steps in the transaction
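The two headline numbers for a transaction, success percentage over binary-coded outcomes and completions per unit time, reduce to one-liners (a sketch; names are illustrative):

```python
def success_percentage(outcomes):
    # Outcomes coded 1 = success, 0 = failure: the binary scoring
    # with a clear end-state recommended for transactions.
    return 100 * sum(outcomes) / len(outcomes)

def completions_per_minute(completed, minutes):
    # Efficiency for transactions users repeat many times:
    # task completions per unit of time.
    return completed / minutes
```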

Choosing the Right Metrics
Ten Types of Usability Studies

• Comparing Products
  - How does the product compare to the competition?
  - How does the product compare to previous releases?

• Metrics
  - Task Success
    - Being able to complete a task correctly is essential
  - Efficiency
    - Gauged with task completion time, number of page views, or number of action steps taken: how much effort is required to use the product?
  - Self-Reported Metrics
    - Provide a good summary of the user's overall experience
    - Satisfaction makes sense when the user has a choice of products
  - Combined and Comparative Metrics
    - Useful when you want a big picture of how the product compares from a usability perspective

Choosing the Right Metrics
Ten Types of Usability Studies

• Evaluating Frequent Use of the Same Product
  - Needs to be easy and highly efficient (microwaves, DVD players, web applications)
  - Most people have little patience for products that are difficult to use

• Metrics
  - Task Success
  - Task Time
    - Measuring the time to complete a set of core tasks reveals the effort involved
    - Helpful to compare task completion times to an expert's
  - Efficiency
    - Number of steps needed; time may be short, but separate decisions can be numerous
  - Learnability
    - Time/effort required to achieve maximum efficiency
    - Measured as the previous efficiency metrics over time
  - Self-Reported Metrics
    - Awareness and usefulness: find aspects of the product that should be highlighted, with low awareness but high perceived usefulness once discovered
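Learnability as described above, efficiency metrics tracked over repeated trials, can be summarized with two small helpers (a sketch under assumed names; the curve flattening out is the signal that maximum efficiency has been reached):

```python
def trial_means(times_by_trial):
    # Mean task time for each repeated trial, in order.
    # A flattening curve suggests maximum efficiency was reached.
    return [sum(trial) / len(trial) for trial in times_by_trial]

def expert_ratio(participant_time, expert_time):
    # How many times longer participants take than an expert.
    return participant_time / expert_time
```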

Choosing the Right Metrics
Ten Types of Usability Studies

• Evaluating Navigation and/or Information Architecture
  - Users can quickly and easily find what they are looking for
  - They can navigate around the product, know where they are within the overall structure, and know what options are available to them

• Metrics
  - Task Success
    - Typically evaluated very early in design with wireframes or partially functioning prototypes
    - Give participants key pieces of information to find, to see how well the interface works
  - Errors
  - Efficiency
    - Lostness: the number of steps it took to complete a task relative to the minimum number
  - Card Sorting Data
    - Understand how participants organize information
    - A closed sort identifies how many items are placed into the correct category, indicating the intuitiveness of the information architecture
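The lostness measure mentioned above has a common formulation based on three navigation counts:

```python
import math

def lostness(unique_pages, total_pages, minimum_pages):
    # L = sqrt((N/S - 1)^2 + (R/N - 1)^2), where N = unique pages
    # visited, S = total pages visited, R = minimum pages required.
    # 0 is a perfectly direct path; values above roughly 0.4 are
    # commonly read as the participant being lost.
    n, s, r = unique_pages, total_pages, minimum_pages
    return math.sqrt((n / s - 1) ** 2 + (r / n - 1) ** 2)

# Perfect path: visited exactly the minimum pages, no revisits
```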

Choosing the Right Metrics
Ten Types of Usability Studies

• Increasing Awareness
  - Aimed at increasing awareness of a specific piece of content or functionality
  - Why is something not noticed or used?

• Metrics
  - Live Website Metrics
    - Monitor interactions; not foolproof, as a user may notice an element and decide not to click, or click without truly noticing it
    - A/B testing to see how small changes impact user behavior
  - Self-Reported Metrics
    - Point out specific elements and ask whether participants noticed them during the task, and whether they were aware of the feature before the study began
    - Not everyone has a good memory: show users different elements and ask them to choose which one they saw during the task
  - Behavioral and Physiological Metrics
    - Eye tracking: average time spent looking at a certain element, and average time before users first noticed it

Choosing the Right Metrics
Ten Types of Usability Studies

• Problem Discovery
  - Identify the major usability issues
  - After deployment, find out what annoys users
  - Periodic checkup to see how users are interacting with the product

• Discovery vs. a typical usability study
  - Open-ended; participants may generate their own tasks
  - Strive for realism in typical tasks and in the user's environment
  - Comparing across participants can be difficult

• Metrics
  - Issue-Based Metrics
    - Capture all usability issues; you can convert them into type and frequency
    - Assign severity ratings and develop a quick-hit list of design improvements
  - Self-Reported Metrics

Choosing the Right Metrics
Ten Types of Usability Studies

• Maximizing Usability for a Critical Product
  - Instead of merely striving to be easy to use and efficient (a cell phone), some products have to be (a defibrillator, emergency exit instructions on an airplane)

• Critical vs. noncritical products
  - The entire reason for the product's existence is for the user to complete a very important task
  - Failure will have a significant negative outcome
  - Important that user performance is measured against a target goal; if the product doesn't meet the goal, it must be redesigned
  - Need a larger number of users to have a high degree of certainty

• Metrics
  - Errors
    - Number of errors or mistakes; important to be explicit about what is and isn't an error
  - Task Success
    - A binary approach is recommended
  - Efficiency
    - May also want to tie success to other metrics, such as completion time without errors (the defibrillator example)

Choosing the Right Metrics
Ten Types of Usability Studies

• Creating an Overall Positive User Experience
  - Not enough to be usable; you want an exceptional user experience: thought-provoking, entertaining, slightly addictive
  - Performance is useful, but what the user thinks, feels, and says really matters

• Metrics
  - Self-Reported Metrics
    - Satisfaction: common but not enough
    - Exceeded expectations: you want users to say it was easier, more efficient, or more entertaining than expected
    - Likelihood to purchase, use in the future, or recommend to a friend
  - Behavioral and Physiological Metrics
    - Pupil diameter, heart rate, skin conductance

Choosing the Right Metrics
Ten Types of Usability Studies

• Evaluating the Impact of Subtle Changes
  - The impact on user behavior may not be clear, but can have huge implications for the larger population
  - Font choice and size, placement, visual contrast, color, image choices, terminology, content

• Metrics
  - Live Website Metrics
    - A/B tests compare a control design against an alternative
    - Compare traffic or purchases between the alternative and the control design
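Deciding whether an A/B difference is real rather than noise calls for a significance test; the slides don't prescribe one, but a standard choice for conversion counts is the pooled two-proportion z-test, sketched here:

```python
import math

def ab_z_statistic(conv_a, n_a, conv_b, n_b):
    # Pooled two-proportion z statistic comparing conversion counts
    # between the control (A) and the alternative (B).
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_b - p_a) / se

# |z| > 1.96 indicates a difference significant at the 5% level
```

With the large samples a live site provides, even tiny design changes can reach significance, so the size of the difference matters as much as the test.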

Choosing the Right Metrics
Ten Types of Usability Studies

• Comparing Designs
  - Comparing more than one design alternative
  - Early in the design process, teams put together semi-functional prototypes and evaluate them using a predefined set of metrics

• Participants
  - You can't ask the same participant to perform the same tasks with all the designs
  - Even with counterbalancing of design and task order, the information is of limited value

• Procedure
  - Run the study as between-subjects; each participant works with only one design
  - Alternatively, have a primary design each participant works with, then show the alternative designs and ask for a preference

Choosing the Right Metrics
Ten Types of Usability Studies

• Comparing Designs (continued)

• Metrics
  - Task Success
    - Indicates which design is more usable; with a small sample size, of limited value
  - Task Time
    - Indicates which design is more usable; with a small sample size, of limited value
  - Issue-Based Metrics
    - Compare the frequency of high-, medium-, and low-severity issues across designs to see which one is most usable
  - Self-Reported Metrics
    - Ask participants to choose the prototype they would most like to use in the future (a forced comparison)
    - Ask participants to rate each prototype along dimensions such as ease of use and visual appeal

Other Study Details

• Budgets and Timelines
  - Difficult to provide cost or time estimates for any particular type of study

• General rules of thumb
  - Formative study
    - Small number of participants (≤10)
    - Little impact on timeline and budget
  - Lab study with a larger number of participants (>12)
    - Most significant cost: recruiting and compensating participants
    - Plus the time required to run the tests, additional cost for usability specialists, and time to clean up and analyze the data
  - Online study
    - Roughly half the time is spent setting up the study and the other half cleaning up and analyzing the data; running the study itself requires little if any usability-specialist time
    - On the order of 100-200 person-hours (with about 50% variation)

Other Study Details

• Evaluation Methods
  - Not restricted to a certain type of method (lab test vs. online test)
  - Choose a method based on how many participants you need and what metrics you want to use

• Lab test with a small number of participants
  - One-on-one session between moderator and participant
  - Participant thinks aloud; moderator notes participant behavior and responses to questions
  - Metrics to collect
    - Issue-based metrics: issue frequency, type, severity
    - Performance metrics: task success, errors, efficiency
    - Self-reported metrics: answers to questions about each task and at the end of the study
  - Caution
    - Easy to overgeneralize performance and self-reported metrics without an adequate sample size

Other Study Details

• Evaluation Methods (continued)

• Lab test with a larger number of participants
  - Able to collect a wider range of data, because an increased sample size means increased confidence in the data
  - All performance, self-reported, and physiological metrics are fair game
  - Caution
    - Inferring website traffic patterns from usability lab data is not very reliable, nor is looking at how subtle design changes impact the user experience

• Online studies
  - Testing with many participants at the same time; an excellent way to collect a lot of data in a short time
  - Able to collect many performance and self-reported metrics, and to evaluate subtle design changes
  - Caution
    - Difficult to collect issue-based data, since you can't directly observe participants
    - Good for software or website testing; difficult for consumer electronics

Other Study Details

• Participants
  - Have a major impact on findings

• Recruiting issues
  - Identify the recruiting criteria that determine whether a participant is eligible for the study
  - How to segment users, and how many users are needed, depend on:
    - Diversity of the user population
    - Complexity of the product
    - Specific goals of the study
  - Recruiting strategy
    - Generate a list from customer data
    - Send requests via email distribution lists
    - Use a third party
    - Post an announcement on a website

Other Study Details

• Data Collection
  - Plan how you will capture the data needed for the study; this has a significant impact on how much work remains later when analysis begins

• Lab test with a small number of participants
  - Excel works well
  - Have a template in place for quickly capturing data during testing
  - Enter data in numeric format as much as possible (1 = success, 0 = failure)
  - Everyone should know the coding scheme extremely well: if someone flips scales or doesn't understand what to enter, you must throw out or recode the data

• Larger studies
  - Use a data capture tool
  - Helpful to have the option to download the raw data into Excel
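The same capture-template idea works outside Excel; a hypothetical CSV layout (column names and values here are illustrative only), one row per participant/task pair with success coded numerically so the whole team shares one scheme:

```python
import csv
import io

# Illustrative capture template: one row per participant/task pair.
FIELDS = ["participant", "task", "success", "time_sec", "ease_1to5"]

buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=FIELDS)
writer.writeheader()
writer.writerow({"participant": "P01", "task": "T1",
                 "success": 1, "time_sec": 42, "ease_1to5": 4})

# Read the captured data back as dictionaries for analysis
rows = list(csv.DictReader(io.StringIO(buffer.getvalue())))
```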

Other Study Details

• Data Cleanup
  - Data are rarely in a format that is instantly ready to analyze
  - Cleanup can take anywhere from one hour to a couple of weeks

• Cleanup tasks
  - Filtering data
    - Check for extreme values (e.g., task completion times)
    - Some participants leave in the middle of a study, making times unusually large; impossibly short times may indicate a user not truly engaged
    - Remove results from users who are not in the target population
  - Creating new variables
    - Building on the raw data is useful
    - May create a top-2-box variable for self-reported scales, an aggregate success average representing all tasks, or an overall usability score
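Two of the cleanup steps above, filtering extreme task times and deriving a top-2-box variable, can be sketched directly (the time thresholds are study-specific assumptions, not values from the slides):

```python
def filter_times(times, low=5, high=600):
    # Drop impossibly short times (participant likely not engaged)
    # and abandoned-session times. Thresholds are illustrative.
    return [t for t in times if low <= t <= high]

def top_2_box(ratings, scale_max=5):
    # Share of self-reported ratings in the top two boxes of the
    # scale, a common derived variable.
    return sum(1 for r in ratings if r >= scale_max - 1) / len(ratings)
```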

Other Study Details

• Cleanup tasks (continued)
  - Verifying responses
    - If a large percentage of participants give the same wrong answer, check why
  - Checking consistency
    - Make sure data were captured properly
    - Check task completion times and success against self-reported metrics (e.g., a task completed quickly but given a low rating may mean data were captured incorrectly or the participant confused the scales of the question)
  - Transferring data
    - Capture and clean up the data in Excel, then use another program to run statistics, then move back to Excel to create charts and graphs
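The consistency check above (fast completion but a low ease rating) is easy to automate over captured records; a sketch with illustrative thresholds and field names:

```python
def inconsistent_records(records, fast_sec=15, low_rating=2):
    # Flag records where a task was completed very quickly but rated
    # as hard: a pattern that often means a flipped scale or a data
    # entry error. Thresholds and field names are assumptions.
    return [r for r in records
            if r["time_sec"] <= fast_sec and r["ease"] <= low_rating]
```

Flagged records should be inspected, not silently dropped, since the fix may be recoding rather than removal.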

Summary

• Formative vs. summative approach
  - Formative: collecting data to help improve the design before it is launched or released
  - Summative: measuring the extent to which certain target goals were achieved

• When deciding on the most appropriate metrics, take into account the two main aspects of the user experience: performance and satisfaction
  - Performance metrics characterize what the user does
  - Satisfaction metrics relate to what users think or feel about their experience

• Budgets and timelines need to be planned well in advance when running any usability study

• Three general types of evaluation methods are used to collect usability data
  - Lab tests with a small number of participants: best for formative testing
  - Lab tests with a large number of participants (>12): best for capturing a combination of qualitative and quantitative data
  - Online studies with a very large number of participants (>100): best for examining subtle design changes and preferences

Summary

• Clearly identify the criteria for recruiting participants
  - Must be truly representative of the target group
  - Formative: 6 to 8 users per iteration is enough; if there are distinct groups, it is helpful to have four from each group
  - Summative: 50 to 100 representative users

• Plan how you are going to capture all the data needed
  - Have a template for quickly capturing data during the test
  - Make sure everyone is familiar with the coding conventions

• Data cleanup
  - Manipulating the data to make them usable and reliable
  - Filtering removes extreme values or problematic records
  - Consistency checks and verifying responses make sure participants' intentions map to their responses
