- The paper shows that AI explanations significantly improve UX evaluators' accuracy in identifying usability issues.
- It employs a 2x2 mixed-methods design with a Wizard-of-Oz AI Assistant to assess the effects of explanation and synchronization.
- Findings indicate that synchronous presentation of AI suggestions increases evaluators' engagement, while explanations improve their understanding of the AI and their satisfaction.
Human-AI Collaboration for UX Evaluation: Effects of Explanation and Synchronization
This essay discusses the paper titled "Human-AI Collaboration for UX Evaluation: Effects of Explanation and Synchronization" (arXiv:2112.12387). The research examines the effects of two design factors, explanations and synchronization, on AI-assisted user experience (UX) evaluation. The authors develop a tool called AI Assistant and investigate how these factors influence the performance and perceptions of UX evaluators tasked with analyzing usability test videos.
Introduction
Usability testing is critical in UX development, enabling evaluators to identify user experience issues during the interface and system development lifecycle [mcdonald2012exploring]. While AI has shown promise in assisting such tasks, the optimal design for effective collaboration between UX evaluators and AI remains under-explored. The paper examines two specific design factors drawn from the literature on human-human and human-AI collaboration: explanations provided by the AI, and the timing (synchronization) of when the AI's suggestions are presented relative to the evaluator's own analysis.
Figure 1: The user interface of AI Assistant that presents the AI-suggested problems synchronously with explanations: (a) video player; (b) annotation panel; (c) identified problem table; (d) timeline of the AI-suggested problems; (e) explanations of the AI-suggested problems.
To this end, the authors developed AI Assistant, simulating an AI through a Wizard-of-Oz (WoZ) approach to provide reliable suggestions. The tool was designed with four user interfaces (UIs) crossing the two factors: explanations (with/without) and synchronization (synchronous/asynchronous). A study was conducted with 24 UX evaluators who used AI Assistant to identify UX problems in usability test videos.
Methodology
Design of AI Assistant
AI Assistant was iteratively designed to aid UX evaluators by suggesting potential UX problems within usability test videos. The tool features a video player, an annotation panel, a table of identified problems, a timeline of AI-suggested problems, and, when applicable, an explanation panel. The WoZ approach simulated an AI capable of detecting UX problems with reasonable performance, enabling a controlled study environment.
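A minimal data-model sketch helps make this interface behavior concrete. The Python code below is a hypothetical reconstruction, not the authors' implementation: the `AISuggestion` type and `visible_suggestions` function are illustrative names, and the filtering rule assumes that the synchronous UI reveals each suggestion once video playback reaches its timestamp, while the asynchronous UI exposes the full list regardless of playback position.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AISuggestion:
    """One AI-suggested UX problem anchored to a moment in the test video."""
    timestamp_s: float          # position in the usability test video, in seconds
    description: str            # short statement of the suspected UX problem
    explanation: Optional[str]  # shown only in the with-explanation conditions


def visible_suggestions(suggestions: List[AISuggestion],
                        playback_position_s: float,
                        synchronous: bool) -> List[AISuggestion]:
    """Return the suggestions the interface would display at this moment.

    Synchronous UI: a suggestion appears only once playback reaches its
    timestamp. Asynchronous UI: the whole list is available regardless of
    playback position.
    """
    if not synchronous:
        return list(suggestions)
    return [s for s in suggestions if s.timestamp_s <= playback_position_s]
```

Keeping the explanation optional on each suggestion lets the same data model serve both the with- and without-explanation conditions.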
Experimental Design and Procedure
A mixed-methods 2-by-2 experimental design was employed, with explanations as a between-subjects factor and synchronization as a within-subjects factor. Each participant was assigned to either an explanation or no explanation condition and completed tasks using both synchronous and asynchronous UIs (Figure 2).
Figure 2: Comparison of the four different versions of AI Assistant.
Each study session included pre-task training, two formal tasks, and a post-task interview. Participants were experienced UX evaluators with expertise in UX research and usability testing. The study collected task performance data, behavioral patterns during tool interaction, and subjective perceptions gathered through questionnaires and interviews.
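The condition assignment implied by this design can be sketched as follows. The script is purely illustrative: the counterbalancing of task order and the exact group split are assumptions, since this essay does not report them.

```python
N_PARTICIPANTS = 24  # as reported in the paper


def assign_conditions(n: int = N_PARTICIPANTS) -> list:
    """Sketch of the 2x2 mixed design: explanation varies between subjects,
    synchronization varies within subjects, and UI order alternates within
    each explanation group (assumed counterbalancing)."""
    plan = []
    for pid in range(1, n + 1):
        explanation = "with_explanation" if pid % 2 else "without_explanation"
        sync_first = (pid // 2) % 2 == 0  # alternate task order within each group
        order = (("synchronous", "asynchronous") if sync_first
                 else ("asynchronous", "synchronous"))
        plan.append({"participant": pid, "explanation": explanation,
                     "task_1_ui": order[0], "task_2_ui": order[1]})
    return plan


if __name__ == "__main__":
    for row in assign_conditions()[:4]:
        print(row)
```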
Results
Quantitative analysis reveals significant interactions between the provision of explanations and synchronization, with explanations improving UX evaluators' understanding of the AI's reasoning and enhancing their problem detection accuracy. Evaluators who received explanations also identified more of the problems the AI had missed (the AI's false negatives), suggesting that their evaluations remained robust even where the AI overlooked issues (Figure 3).

Figure 3: Two-way interaction between explanations and synchronization on the number of UX problems found, demonstrating the impact on Recall.
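Recall here can be read as the share of ground-truth UX problems (those established by reference analysis of the test videos) that an evaluator reports, and the related measure from the paragraph above is how many of the AI's misses an evaluator still catches. The sketch below uses hypothetical problem identifiers and function names to illustrate both computations.

```python
def recall(found: set, ground_truth: set) -> float:
    """Share of ground-truth UX problems the evaluator identified."""
    return len(found & ground_truth) / len(ground_truth)


def ai_misses_caught(found: set, ground_truth: set, ai_suggested: set) -> int:
    """Ground-truth problems the AI did not suggest but the evaluator still found."""
    return len((ground_truth - ai_suggested) & found)


# Hypothetical numbers for illustration only
ground_truth = {f"P{i}" for i in range(1, 11)}       # 10 reference problems
ai_suggested = {"P1", "P2", "P3", "P5", "P6", "P7"}  # AI missed P4, P8, P9, P10
found = {"P1", "P2", "P4", "P7", "P8", "P9"}         # what the evaluator reported

print(f"recall = {recall(found, ground_truth):.2f}")                               # 0.60
print(f"AI misses caught = {ai_misses_caught(found, ground_truth, ai_suggested)}")  # 3
```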
Behaviorally, the synchronous UI without explanations produced the highest evaluator engagement, reflected in more time spent on analysis and more active interaction with the video. Participants generally preferred synchronous presentation, reporting that it reduced the bias and cognitive overload they associated with the asynchronous presentation of AI suggestions (Figure 4).

Figure 4: Two-way interaction between explanations and synchronization on the time spent on analysis (normalized by video length).
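The engagement measure in Figure 4 normalizes analysis time by video length so that sessions on videos of different durations are comparable. A one-function sketch with hypothetical numbers:

```python
def normalized_analysis_time(analysis_seconds: float, video_seconds: float) -> float:
    """Analysis time expressed as a multiple of the test video's length."""
    return analysis_seconds / video_seconds


# Hypothetical example: 18 minutes of analysis on a 10-minute test video
print(normalized_analysis_time(18 * 60, 10 * 60))  # 1.8
```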
Subjective Perceptions
Participants reported greater satisfaction and a better understanding of the AI when explanations were present, and they highly valued the ability to cross-check AI-suggested problems against their own judgments. Although no statistically significant effect on trust was observed, participants' comments suggest that a better understanding of the AI could improve trust over time.
Implications for Human-AI Collaboration
The paper suggests that AI, by offering justifications and timely insights, can serve as a valuable partner in UX evaluation rather than a stand-in for human expertise. While AI explanations support understanding and task efficiency, synchronization sustains engagement and preserves human agency. Design strategies should make AI explanations adaptable to individual evaluators' needs. Future work may focus on adaptive AI systems that learn from user feedback, moving beyond the limitations of the current WoZ simulation.
Lastly, applicability to broader, higher-stakes domains remains an open question, since the "junior colleague" metaphor may not transfer directly to settings where human agency and trust in AI carry different weight.
Conclusion
This paper demonstrates the nuanced effects of AI explanations and synchronization on UX evaluators' performance and engagement when analyzing usability test videos. By carefully designing AI tools that balance intelligent assistance with human analytical skill, future systems can support more effective and efficient human-AI collaboration, with implications for UX evaluation and other domains requiring careful judgment and decision-making.