Complementary material to the paper

Bringing Transparency Design into Practice (IUI'18)

Malin Eiband (1), Hanna Schneider (1), Mark Bilandzic (2), Julian Fazekas-Con (1), Mareike Haug (1), Heinrich Hussmann (1)
(1) LMU Munich, (2) Freeletics GmbH, Munich, Germany

Abstract: Intelligent systems, which are on their way to becoming mainstream in everyday products, make recommendations and decisions for users based on complex computations. Researchers and policy makers increasingly raise concerns regarding the lack of transparency and comprehensibility of these computations from the user perspective. Our aim is to advance existing UI guidelines for more transparency in complex real-world design scenarios involving multiple stakeholders. To this end, we contribute a stage-based participatory process for designing transparent interfaces incorporating perspectives of users, designers, and providers, which we developed and validated with a commercial intelligent fitness coach. With our work, we hope to provide guidance to practitioners and to pave the way for a pragmatic approach to transparency in intelligent systems.

To further illustrate the different stages of our process, we provide complementary material on how each stage was applied in the Freeletics design scenario.

(A) What to Explain: Expert Mental Model

What happens to the best of our knowledge? What can be explained? What does an expert mental model of the system look like?

From the insights of the workshop about the AI of the Freeletics Bodyweight Coach, we constructed the following expert mental model (see figure 1), i.e. an "optimal" version of a user mental model of the system's workings. This expert mental model comprises all key components of the AI grouped by data type, such as the user's profile, the feedback given by the user, or the fitness level. We identified performance class and training cycle as key components with the highest impact on the calculation of personalized workouts.

Expert mental model of the Freeletics Bodyweight Coach AI
Figure 1: The expert mental model of the Freeletics Bodyweight Coach AI resulting from a workshop with Freeletics employees.

(B) What to Explain: User Mental Model

How do users currently make sense of the system? What is the user mental model of the system based on its current UI? How does it differ from the expert mental model?

We conducted semi-structured interviews at popular workout spots in Munich to elicit the current user mental model of the app and of how the training plan is assembled. Since users' mental models are likely to differ based on their prior experiences and knowledge [2], this stage aims at condensing them into one overarching mental model that reflects the current beliefs and understandings of most users as a status quo of the system's transparency, and at identifying missing and erroneous key components in comparison to the expert mental model. In our case, we found that users could already make sense of key components like situational variables (marked in green, see figure 2), but did not know that BMI, performance class, training cycle and the training day influence the selection of workouts (marked in yellow). Instead, they believed that the feedback they give after each exercise is used by the algorithm (marked in red), which, at the time of this research, was not the case. The extracted user mental model thus turned out to be a good indicator of how far the current UI already supports the underlying AI concepts, and of where there is still need for improvement.

User mental model of the Freeletics Bodyweight Coach AI
Figure 2: The current user mental model of the Freeletics Bodyweight Coach AI, based on the insights of semi-structured interviews with users.
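The comparison of the user mental model with the expert mental model can be thought of as a set comparison. The following is a minimal sketch under that assumption, not the procedure used in the study: the function name `compare_mental_models` is hypothetical, and the two sets contain only the components named in the text above, not the complete models shown in the figures.

```python
def compare_mental_models(expert, user):
    """Classify components by comparing a user model against an expert model.

    Both arguments are sets of component names (an assumed, simplified
    representation of a mental model).
    """
    return {
        "correct": expert & user,    # understood by users and actually used by the AI
        "missing": expert - user,    # used by the AI but unknown to users
        "erroneous": user - expert,  # believed by users but not used by the AI
    }


# Illustrative subset of components taken from the text, not the full models.
expert_model = {
    "situational variables",
    "BMI",
    "performance class",
    "training cycle",
    "training day",
}
user_model = {
    "situational variables",
    "feedback after each exercise",
}

comparison = compare_mental_models(expert_model, user_model)
for category in ("correct", "missing", "erroneous"):
    print(category, sorted(comparison[category]))
```

In this simplified view, the three categories correspond to the green, yellow, and red markings in figure 2.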

(C) What to Explain: Synthesis – Target Mental Model

Which key components of the algorithm do users want to be made transparent in the UI? To what extent are users actually interested in the rationale behind the algorithm?

We used the missing and erroneous key components in the user mental model as a basis for assessing users' actual interest in the underlying AI concepts. For this purpose, we let users sort cards [5] with statements about how the key components of the AI influence their personal workout. As suggested by Pu and Chen [3] and Tullio et al. [4], we used conversational language and two levels of detail when creating the statements. Cards with a low level of detail only stated that a component influences the AI calculation, whereas cards with a high level of detail also explained how this component is used by the AI. Figure 3 shows exemplary statements, and figure 4 gives an overview of all cards sorted by participants 1 and 5.

Exemplary statements in two levels of detail
Figure 3: Exemplary statements about the workings of the AI with regard to performance class in two levels of detail (translated to English).
Card sorting by two participants
Figure 4: Card sorting by participant 1 (left) and 5 (right).

Our analysis revealed that participants were most interested in detailed explanations of performance class, training focus, workout times, and training cycle. Adding these components to the already correct ones in the user mental model yields the target mental model (see figure 5), the basis for prototyping.

Target mental model of the Freeletics Bodyweight Coach AI
Figure 5: The target mental model of the Freeletics Bodyweight Coach AI, synthesized from the comparison of the expert to the user mental model and the users' actual interest in the algorithmic components.
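The synthesis step described above amounts to a union of two groups of components. The sketch below illustrates this under simplifying assumptions: the function name `build_target_model` is hypothetical, mental models are represented as plain sets of component names, and only components mentioned in the text are listed.

```python
def build_target_model(correct_components, components_of_interest):
    """Combine what users already understand with what they want explained."""
    return set(correct_components) | set(components_of_interest)


# Components users already understood correctly (subset named in the text).
correct_components = {"situational variables"}

# Components participants were most interested in, according to the card sorting.
components_of_interest = {
    "performance class",
    "training focus",
    "workout times",
    "training cycle",
}

target_model = build_target_model(correct_components, components_of_interest)
print(sorted(target_model))
```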

(D) How to Explain: Iterative Prototyping

How can the target mental model be reached through UI design? How and where can transparency be integrated into the UI of the system?

Following the outcome of a brainstorming session, we developed two different click prototypes for the target mental model. We focused on performance class and training cycle within the scope of our project, since these components have the greatest impact on the selection of workouts and were at the same time among the most interesting ones from the users' perspective. Figure 6 shows the iterative development of the prototypes, from low to high fidelity.

Figure 6: The two prototypes developed within the scope of our project.

(E) How to Explain: Design Evaluation

How has the user mental model developed? Has the target mental model been reached?

Both prototypes were evaluated with Freeletics users, as described in the paper, and rated with the System Usability Scale [1].
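For readers unfamiliar with the System Usability Scale, the sketch below shows its standard scoring rule: ten items rated from 1 to 5, where odd-numbered items contribute (rating - 1), even-numbered items contribute (5 - rating), and the sum is multiplied by 2.5 to give a score between 0 and 100. The function name `sus_score` and the example ratings are hypothetical and are not data from the study.

```python
def sus_score(responses):
    """Compute the SUS score (0 to 100) from ten ratings on a 1-to-5 scale."""
    if len(responses) != 10:
        raise ValueError("SUS requires exactly 10 responses")
    if any(not 1 <= rating <= 5 for rating in responses):
        raise ValueError("each response must be between 1 and 5")

    total = 0
    for index, rating in enumerate(responses):
        if index % 2 == 0:
            # Items 1, 3, 5, 7, 9 are positively worded.
            total += rating - 1
        else:
            # Items 2, 4, 6, 8, 10 are negatively worded.
            total += 5 - rating
    return total * 2.5


# Hypothetical ratings of one participant, for illustration only.
example_responses = [4, 2, 4, 2, 4, 2, 4, 2, 4, 2]
print(sus_score(example_responses))  # prints 75.0
```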

REFERENCES
[1] Brooke, J. (1996). SUS – A Quick and Dirty Usability Scale. Usability Evaluation in Industry, 189(194), 4-7.
[2] Norman, D. A. (1983). Some Observations on Mental Models. Mental Models, 7(112), 7-14.
[3] Pu, P., and Chen, L. (2006). Trust Building with Explanation Interfaces. In Proceedings of the 11th International Conference on Intelligent User Interfaces (pp. 93-100). ACM.
[4] Tullio, J., Dey, A. K., Chalecki, J., and Fogarty, J. (2007). How it Works: A Field Study of Non-Technical Users Interacting with an Intelligent System. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (pp. 31-40). ACM.
[5] Wood, J. R., and Wood, L. E. (2008). Card Sorting: Current Practices and Beyond. Journal of Usability Studies, 4(1), 1-6.