Christos Katsanos | ckatsanos@ece.upatras.gr
Nikolaos Tselios | nitse@ece.upatras.gr
Nikolaos Avouris | avouris@ece.upatras.gr
Are Ten Participants Enough for Evaluating
Information Scent of Web Page Hyperlinks?
IFIP INTERACT | Uppsala, Sweden | 24-28 August, 2009
Purpose & Motivation
• A critical factor in web navigation is information scent (Fu & Pirolli, 2007; Blackmon et al., 2005; Miller & Remington, 2004)
  ◦ Information scent: the user's assessment of the semantic relevance of the navigation options on a web page
• Participants are often asked to evaluate scent by providing ratings (Miller & Remington, 2004; Brumby & Howes, 2008)
• It remains unclear how many raters are required to obtain representative estimates of information scent
The Study: First Phase
Design & Procedures
• Web-based survey
• Task: rate the semantic relevance of every link to the provided goal (1 = poor relevance, 5 = high relevance)
• 101 participants
• 8 navigation menus, 8 links each
• 6464 ratings in total (101 participants × 64 links)
Analysis Methodology
• Reference case = scent ratings from all 101 participants
• Select 10 random samples for each sample size N
  ◦ N = 2, 5, 10, 15, 20, 25, 30, 40 and 50
• [Sample ratings] vs. [ratings of all 101 participants]
  ◦ Average Spearman correlation
• How many raters are enough to represent the ratings of the whole dataset?
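The resampling procedure above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' analysis script: the slides do not say how ratings were aggregated, so the sketch assumes that ratings are averaged per link and that the correlation is computed over all 64 links at once. The function names and the synthetic data generator are hypothetical.

```python
import random
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the two rank vectors.

    Undefined (division by zero) if either input is constant.
    """
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def mean_ratings(ratings, raters):
    """Mean rating per link over the given rater indices (assumed aggregation)."""
    n_links = len(ratings[0])
    return [mean(ratings[r][link] for r in raters) for link in range(n_links)]


def average_agreement(ratings, sample_size, n_samples=10, seed=0):
    """Average Spearman correlation between random rater samples and all raters."""
    rng = random.Random(seed)
    all_raters = range(len(ratings))
    reference = mean_ratings(ratings, all_raters)
    correlations = []
    for _ in range(n_samples):
        sample = rng.sample(all_raters, sample_size)
        correlations.append(spearman(mean_ratings(ratings, sample), reference))
    return mean(correlations)


def synthetic_ratings(n_raters=101, n_links=64, seed=1):
    """Hypothetical 1-5 ratings: a latent scent per link plus rater noise."""
    rng = random.Random(seed)
    true_scent = [rng.uniform(1, 5) for _ in range(n_links)]
    return [
        [min(5, max(1, round(s + rng.gauss(0, 1)))) for s in true_scent]
        for _ in range(n_raters)
    ]


if __name__ == "__main__":
    data = synthetic_ratings()  # stand-in for the 101 x 64 survey matrix
    for n in (2, 5, 10, 15, 20, 25, 30, 40, 50):
        print(n, round(average_agreement(data, n), 3))
```

With real survey data, `synthetic_ratings()` would be replaced by the 101 × 64 matrix of collected ratings. The numbers printed for the synthetic data say nothing about the study's results.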
Results
[Figure: chart not recoverable; error bars = (r_MEAN ± r_SD)²]
• 10 raters: 84-90% of total variance
• Twice as many raters (×2): still the same
• Three times as many raters (×3): +5% closer to the whole dataset
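One way to read the squared error bars, offered here as an assumption rather than something the slides state: squaring a correlation gives the proportion of variance shared between the sample ratings and the full dataset, so the bar limits would be the squares of the mean correlation minus and plus one standard deviation. Here \bar{r} and s_r are shorthand for r_MEAN and r_SD.

```latex
\[
  r^2_{\text{lower}} = (\bar{r} - s_r)^2,
  \qquad
  r^2_{\text{upper}} = (\bar{r} + s_r)^2
\]
\[
  \sqrt{0.84} \approx 0.917,
  \qquad
  \sqrt{0.90} \approx 0.949
\]
```

Under that reading, the 84-90% range reported for 10 raters would correspond to correlations of roughly 0.92 to 0.95 with the ratings of all 101 participants.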
First-phase: Conclusion
• 10 raters appear to be a cost-effective way to evaluate information scent without sacrificing the quality of results
• But how close are the scent ratings of 10 participants to observed navigation behavior?
The Study: Second Phase
Design & Procedures
• Eye-tracking user study
• Participants performed the same 8 navigation tasks used in the first phase
• 54 users (not involved in the first phase)
• Two measures of users' behavior:
  ◦ clicks on each link
  ◦ fixations on each link, adjusted for text length
• 432 recordings in total (54 users × 8 tasks)
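The two behavioral measures could be tallied per link roughly as follows. This is a sketch under assumptions: the slides do not describe the recording format or how fixations were adjusted for text length, so the `Recording` layout is hypothetical and the adjustment used here (fixation count divided by the number of characters in the link label) is only one plausible choice.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Recording:
    """One user performing one navigation task (hypothetical record layout)."""
    user_id: int
    task_id: int
    clicked_link: str
    fixations: dict[str, int]  # link label -> number of fixations on that link


def clicks_per_link(recordings):
    """Count how many recordings ended with a click on each link."""
    return Counter(rec.clicked_link for rec in recordings)


def adjusted_fixations_per_link(recordings):
    """Total fixations per link divided by label length in characters.

    The per-character normalization is an assumption; the study may have
    adjusted for text length differently.
    """
    totals = Counter()
    for rec in recordings:
        totals.update(rec.fixations)
    return {label: count / len(label) for label, count in totals.items()}


if __name__ == "__main__":
    # Made-up recordings for a single task, for illustration only.
    task_recordings = [
        Recording(1, 1, "News", {"News": 3, "Contact us": 5}),
        Recording(2, 1, "Contact us", {"News": 1, "Contact us": 5}),
    ]
    print(clicks_per_link(task_recordings))
    print(adjusted_fixations_per_link(task_recordings))
```

Both functions are meant to be called on the recordings of one task at a time, since each task has its own menu of 8 links.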
Analysis Methodology
• Reference case = behavioral data from 54 users
• [Scent ratings from rater samples, 1st phase] vs. [measures of users' navigation behavior, 2nd phase]
  ◦ Average Spearman correlation
• How many raters are enough to reach an acceptable level of correlation with these two measures?
Results
[Figure: chart not recoverable; error bars = r_MEAN ± r_SD]
• Clicks on each link
  ◦ r_10-raters is 0.7% different from r_101-raters
  ◦ r_101-raters = 0.80, p < .01
• Fixations on each link
  ◦ r_10-raters is 7.4% different from r_101-raters
  ◦ r_101-raters = 0.40, ns
Second-phase: Conclusion
• 10 participants provide scent ratings that are close to
  ◦ observed link-selection behavior (clicks)
  ◦ distribution of attention (fixations)
• However, scent ratings should be used only as a rough indicator of users' distribution of attention
  ◦ r_s = 0.40, ns
Summary & Questions
• Investigated the well-known "how many users" debate in the context of information scent evaluation
• Scent ratings from 10 participants appeared to be enough for a discount evaluation of information scent
• More studies are required in highly specialized domains and/or with varied user group composition
EXTRA SLIDES
First-Phase: Question example
Second-Phase: How many users are enough?
[Figure labels: Clicks Count, Observations Count]


