IAT 334 Experimental Evaluation | SCHOOL OF INTERACTIVE ARTS + TECHNOLOGY


<ul><li> Slide 1 </li> <li> IAT 334 Experimental Evaluation. SCHOOL OF INTERACTIVE ARTS + TECHNOLOGY [SIAT] | WWW.SIAT.SFU.CA </li> <li> Slide 2 </li> <li> March 10, 2011, IAT 334. Evaluation: Evaluation styles. Subjective data: questionnaires, interviews. Objective data: observing users (techniques, recording). Usability specifications: why, how. </li> <li> Slide 3 </li> <li> Our goal? </li> <li> Slide 4 </li> <li> Evaluation. Earlier: interpretive and predictive (heuristic evaluation, walkthroughs, ethnography). Now: user-involved (usage observations, experiments, interviews...). </li> <li> Slide 5 </li> <li> Evaluation Forms. Summative: after a system has been finished; make judgments about the final item. Formative: as the project is forming; all through the lifecycle; early, continuous. </li> <li> Slide 6 </li> <li> Evaluation Data Gathering. Design the experiment to collect the data, to test the hypothesis, to evaluate the interface, to refine the design. Information we gather about an interface can be subjective or objective. Information can also be qualitative or quantitative. Which are tougher to measure? </li> <li> Slide 7 </li> <li> Subjective Data. Satisfaction is an important factor in performance over time. Learning what people prefer is valuable data to gather. </li> <li> Slide 8 </li> <li> Methods. Ways of gathering subjective data: questionnaires; interviews; booths (e.g., trade show); call-in product hotline; field support workers. </li> <li> Slide 9 </li> <li> Questionnaires. Preparation is expensive, but administration is cheap. Oral vs. 
written. Oral advantages: can ask follow-up questions. Oral disadvantages: costly, time-consuming. Forms can provide better quantitative data. </li> <li> Slide 10 </li> <li> Questionnaires: Issues. Only as good as the questions you ask. Establish the purpose of the questionnaire. Don't ask things that you will not use. Who is your audience? How do you deliver and collect the questionnaire? </li> <li> Slide 11 </li> <li> Questionnaire Topic. Can gather demographic data and data about the interface being studied. Demographic data: age, gender; task expertise; motivation; frequency of use; education/literacy. </li> <li> Slide 12 </li> <li> Interface Data. Can gather data about: screen; graphic design; terminology; capabilities; learning; overall impression; ... </li> <li> Slide 13 </li> <li> Question Format. Closed format: answer restricted to a set of choices. Example: "Characters on screen", rated from 1 (hard to read) to 7 (easy to read). </li> <li> Slide 14 </li> <li> Closed Format. Likert scale: a typical scale uses 5, 7, or 9 choices; above that, differences are hard to discern. Using an odd number gives a neutral choice in the middle. </li> <li> Slide 15 </li> <li> Closed Format. Advantages: clarifies alternatives; easily quantifiable; eliminates useless answers. Disadvantages: must cover the whole range; all choices should be equally likely; you don't get interesting, different reactions. </li> <li> Slide 16 </li> <li> Issues. Question specificity: "Do you have a computer?" 
Language: beware terminology, jargon. Clarity. Leading questions: can be phrased either positively or negatively. </li> <li> Slide 17 </li> <li> Issues. Prestige bias: people answer a certain way because they want you to think that way about them. Embarrassing questions. Hypothetical questions. Halo effect: when the estimate of one feature affects the estimate of another (e.g., intelligence/looks). </li> <li> Slide 18 </li> <li> Deployment Steps. Discuss questions among the team. Administer verbally or in writing to a few people (pilot); verbally query them about their thoughts on the questions. Administer the final test. </li> <li> Slide 19 </li> <li> Open-ended Questions. Ask for unprompted opinions. Good for general, subjective information, but difficult to analyze rigorously. May help with design ideas: "Can you suggest improvements to this interface?" </li> <li> Slide 20 </li> <li> Ethics. People can be sensitive about this process and issues. Make sure they know you are testing software, not them. Attribution theory: studies why people believe that they succeeded or failed, whether because of themselves or outside factors (gender, age differences). Can quit anytime. </li> <li> Slide 21 </li> <li> Objective Data. Users interact with the interface. You observe, monitor, calculate, examine, measure. Objective, scientific data gathering. Comparison to interpretive/predictive evaluation. </li> <li> Slide 22 </li> <li> Observing Users. Not as easy as you think. One of the best ways to gather feedback about your interface. Watch, listen and learn as a person interacts with your system. </li> <li> Slide 23 </li> <li> Observation. Direct: in the same room; can be intrusive; users are aware of your presence; you only see it one time; may use a semitransparent mirror to reduce intrusiveness. Indirect: video recording; reduces intrusiveness, but doesn't eliminate it; cameras focused on screen, face &amp; keyboard. Gives archival 
record, but you can spend a lot of time reviewing it. </li> <li> Slide 24 </li> <li> Location. Observations may be in the lab (maybe a specially built usability lab): easier to control; can have the user complete a set of tasks. Or in the field: watch their everyday actions; more realistic; harder to control other factors. </li> <li> Slide 25 </li> <li> Challenge. In simple observation, you observe actions but don't know what's going on in their head. Often utilize some form of verbal protocol where users describe their thoughts. </li> <li> Slide 26 </li> <li> Verbal Protocol. One technique: think-aloud. The user describes verbally what s/he is thinking and doing: what they believe is happening; why they take an action; what they are trying to do. </li> <li> Slide 27 </li> <li> Think Aloud. Very widely used, useful technique. Allows you to understand the user's thought processes better. Potential problems: can be awkward for the participant; thinking aloud can modify the way the user performs the task. </li> <li> Slide 28 </li> <li> Teams. Another technique: co-discovery learning. Join pairs of participants to work together. Use think-aloud. Perhaps have one person be a semi-expert (coach) and one be a novice. More natural (like conversation), so it removes some of the awkwardness of individual think-aloud. </li> <li> Slide 29 </li> <li> Alternative. What if thinking aloud during the session will be too disruptive? Can use a post-event protocol: the user performs the session, then watches the video afterwards and describes what s/he was thinking. Sometimes difficult to recall. </li> <li> Slide 30 </li> <li> Historical Record. In observing users, how do you capture events in the session for later analysis? </li> <li> Slide 31 </li> <li> Capturing a Session. 1. 
Paper &amp; pencil: can be slow; may miss things; is definitely cheap and easy. [Figure: sample paper log sheet with a Time column (10:00, 10:03, 10:08, 10:22) and columns for Task 1, Task 2, Task 3.] </li> <li> Slide 32 </li> <li> Capturing a Session. 2. Audio tape: good for talk-aloud; hard to tie to the interface. 3. Video tape: multiple cameras probably needed; good record; can be intrusive. </li> <li> Slide 33 </li> <li> Capturing a Session. 4. Software logging: modify software to log user actions; can give time-stamped key press or mouse events. Two problems: too low-level, want higher-level events; massive amount of data, need analysis tools. </li> <li> Slide 34 </li> <li> Assessing Usability. Usability specifications: quantitative usability goals, used as a guide for knowing when the interface is "good enough". Should be established as early as possible in the development process. </li> <li> Slide 35 </li> <li> Measurement Process. "If you can't measure it, you can't manage it." Need to keep gathering data on each iterative refinement. </li> <li> Slide 36 </li> <li> What to Measure? Usability attributes: initial performance; long-term performance; learnability; retainability; advanced feature usage; first impression; long-term user satisfaction. </li> <li> Slide 37 </li> <li> How to Measure? Benchmark task: a specific, clearly stated task for users to carry out. Example (calendar manager): "Schedule an appointment with Prof. Smith for next Thursday at 3pm." Users perform these under a variety of conditions and you measure performance. </li> <li> Slide 38 </li> <li> Assessment Technique
<table>
<tr><th>Usability attribute</th><th>Measuring instrument</th><th>Value to be measured</th><th>Current level</th><th>Worst acceptable level</th><th>Planned target level</th><th>Best possible level</th><th>Observed results</th></tr>
<tr><td>Initial performance</td><td>Benchmark task</td><td>Length of time to successfully add appointment on first trial</td><td>15 secs (manual)</td><td>30 secs</td><td>20 secs</td><td>10 secs</td><td></td></tr>
<tr><td>First impression</td><td>Questionnaire</td><td>-2..2</td><td>??</td><td>0</td><td>0.75</td><td>1.5</td><td></td></tr>
</table>
</li> <li> Slide 39 </li> <li> Summary. Measuring instrument: questionnaires; benchmark tasks. </li> <li> Slide 40 </li> <li> Summary. Value to be measured: time to complete task; number or percentage of errors; percent of task completed in a given time; ratio of successes to failures; number of commands used; frequency of help usage. </li> <li> Slide 41 </li> <li> Summary. Target level: often established by comparison with a competing system or a non-computer-based task. </li> <li> Slide 42 </li> <li> Ethics. Testing can be arduous. Each participant should consent to be in the experiment (informal or formal). They should know what the experiment involves, what to expect, and what the potential risks are. Must be able to stop without danger or penalty. All participants are to be treated with respect. Nov 2, 2009, IAT 334. </li> <li> Slide 43 </li> <li> Consent. Why important? People can be sensitive about this process and issues. Errors will likely be made; the participant may feel inadequate. May be mentally or physically strenuous. What are the potential risks (there are always risks)? Examples? Vulnerable populations need special care &amp; consideration (&amp; IRB review): children; disabled; pregnant; students (why?) 
</li> <li> Slide 44 </li> <li> Before Study. Be well prepared so participants' time is not wasted. Make sure they know you are testing software, not them (usability testing, not user testing). Maintain privacy. Explain procedures without compromising results. Can quit anytime. Administer signed consent form. </li> <li> Slide 45 </li> <li> During Study. Make sure the participant is comfortable. The session should not be too long. Maintain a relaxed atmosphere. Never indicate displeasure or anger. </li> <li> Slide 46 </li> <li> After Study. State how the session will help you improve the system. Show the participant how to perform failed tasks. Don't compromise privacy (never identify people; only show videos with explicit permission). Data to be stored anonymously, securely, and/or destroyed. </li> <li> Slide 47 </li> <li> One Model </li> </ul>
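
The usability specification table on Slide 38 can be read as a set of thresholds against which observed results are compared. The following is a minimal sketch in Python of one way to do that, not a method given in the slides. The level values are taken from the table; the class name `UsabilitySpec`, the `assess` method, the status labels, and the observed task times and ratings are all hypothetical and invented for illustration.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class UsabilitySpec:
    """One row of a usability specification table (names are illustrative)."""
    attribute: str
    worst_acceptable: float
    planned_target: float
    best_possible: float
    lower_is_better: bool  # True for times and error counts, False for ratings

    def assess(self, observed: float) -> str:
        """Classify an observed result against this row's levels."""
        # Flip signs for "lower is better" measures so that a larger value
        # always means a better result in the comparisons below.
        sign = -1.0 if self.lower_is_better else 1.0
        value = sign * observed
        if value < sign * self.worst_acceptable:
            return "unacceptable"
        if value < sign * self.planned_target:
            return "acceptable, below target"
        if value < sign * self.best_possible:
            return "meets target"
        return "at best possible level"


# Levels taken from the Slide 38 table.
initial_performance = UsabilitySpec(
    "Initial performance: seconds to add an appointment on first trial",
    worst_acceptable=30, planned_target=20, best_possible=10,
    lower_is_better=True)
first_impression = UsabilitySpec(
    "First impression: questionnaire rating on a -2..2 scale",
    worst_acceptable=0, planned_target=0.75, best_possible=1.5,
    lower_is_better=False)

# Hypothetical observations, invented for this example.
task_times = [22.0, 27.5, 24.0, 26.5]  # seconds, one per participant
ratings = [1, 2, 0, 1]                 # closed-format answers on -2..2

time_status = initial_performance.assess(mean(task_times))
rating_status = first_impression.assess(mean(ratings))
print(initial_performance.attribute, "->", time_status)
print(first_impression.attribute, "->", rating_status)
```

With these made-up observations, the mean task time of 25 seconds falls between the worst acceptable and planned target levels, and the mean rating of 1 falls between the planned target and best possible levels. Averaging is only one possible summary; a real study would choose the statistic and sample size to suit the benchmark task.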