Based on the analysis of the online workshops and retrospective interviews, we describe how the computational notebooks supported participants in applying a sensitive approach to sharing lived experiences as part of group discussions, iteratively learning about ML concepts, anticipating potential benefits and harms of an example ML model, and collaboratively informing the design of alternative ML models according to personal health and wellbeing needs. We then unpack how the user experience of the computational notebook also caused critical incidents and led to information asymmetries and power imbalances.
4.1 Perceived Benefits of Using Computational Notebooks as a Co-Design Tool
4.1.1 Supporting Individual Ideas and Collective Conversations
The computational notebook enabled participants to formulate individual ideas by working through each of the five sections by themselves, both before and in parallel to group discussion. Although workshop participants were referencing the same content, everyone having their own notebook allowed space for original and uninfluenced thought. Y6 reported enjoying the fact that they had a “private” environment “to test those ideas before sharing”, without a fear of “judgment” of the way they interacted with workshop activities (Y6I). When faced with interactive activities, participants had the freedom to input their own ideas, such as personal parameters in the risk score slider, which enriched their personal understanding of an activity. Private use of the notebooks was complemented by the fact that participants had different lived experiences of T1D. Thus, they drew conclusions from an activity as individuals, prior to discussion (“I thought it was interesting, I put in my parameters on it, and saw what I would have.” (Y1I)). By affording participants more agency in their comprehension, the notebook gave them a deeper understanding of the concept at hand before they discussed it with the group. This was a powerful affordance because this approach meant that participants were not bound to the pacing of the group and could take their time with an activity if desired, which would not be the case if, for example, the group was following a slideshow presented by the host (“there's just something nice about giving each individual the capability of, in their own time, flicking through each day and having to think about it.” (C1I)). A young adult with T1D also identified that there would have been less potential for them to bring their own ideas into an activity had they been sharing one notebook as a group, which would have limited their creativity and ability to explore the activity:
“I think there's a potential that people could be overly dominant, if it's just one notebook, [...], I think that there is a possibility that people wouldn't be as creative in trying out ideas.” (Y6I)
Features of the computational notebooks kept conversation on track and provided stimulus to allow fruitful discussion. The visual elements of the notebook provided prompts for the participants to talk about, which was much more effective than discussing a topic without something tangible to compare it to (“I think having that information there in front of us focused us and focused the discussion.” (C3I)). The static and interactive visualisations were used by participants as “jumping off points” (Y2I) to ponder deeper questions and discuss what was not shown in the computational notebook. Y1 pointed out that group discussion gave them time to think deeper on a subject and further develop their own ideas as a result.
“I think it was nice to be in a group rather than individual because sometimes you could sit back and properly think about it. [...] the discussion probably brought more out than if you had just asked me on my own, because sometimes what they were saying helped to push me to say more.” (Y1I)
4.1.2 Facilitating a Calm Approach to Sensitive Topics
The workshops approached sensitive and controversial topics surrounding T1D, including: health risk perceptions; roles of metrics, such as BMI and HbA1c; individuals’ choice of technology; and the individual nature of diabetes. While many of these topics have stigma attached to them, the computational notebook was appropriated as a ticket-to-talk about a given topic, with control over concealing and revealing personal experiences:
“I think [the dataset activity] was a really good way to sort of hammer down the point that everybody's diabetes and their diabetes control is so vastly different, with people using different tech as well. I think it was a really good conversation starter for that.” (Y6I)
The notebook helped to facilitate the discussion of these sensitive topics by clearly displaying information, and then inviting open discussion, rather than railroading the conversation down a specific narrative path. For example, a young adult with T1D challenged the acceptability and significance of the ML feature “body mass index” (BMI) and a family carer referred to conversations on how inappropriate representation of population groups can lead to health inequalities as part of the dataset session:
“It was really clear that making false assumptions and making decisions, particularly when it comes to medical issues and clinical need, based on a dataset that doesn't represent a particular subset, and applying that decision to that subset is not just unhelpful, but potentially harmful.” (F4I)
Additionally, participants found familiarity not only in the way that data was presented as charts and tables, but also in the content of the data itself, explaining that “there isn't any embarrassment” when discussing sensitive topics, but in fact “there is commonality” between participants (Y6I). Participants felt empowered by the content contained in the notebook and comfortable addressing sensitive topics:
“I think the way that people opened up individually meant that everybody felt pretty comfortable sharing their different perspectives. I think these three charts definitely facilitated and opened the conversation because of the way it represented things that people recognise from their lives.” (F2I)
4.1.3 Anticipating Potential Benefits and Harms of an Example ML Model
A further benefit of deploying the computational notebook as a design probe was that it supported participants in anticipating potential benefits and harms of the example ML model and the example risk score visualisation. Participants valued the ML model feature importance plot as a visual way to learn how the ML model predicted risk and appreciated the simplicity of a risk score as a boundary object to foster discussion.
Considering that hospital admissions often highlight a need for clinical intervention and that resources are typically limited, clinicians gauged whether the ML model could not only help identify the most vulnerable patient groups and conduct targeted interventions but also help allocate resources more efficiently and equitably. If aligned with clinical workflows, the ML model could inform the quality assessment of clinical practice and the effectiveness of “education” (C1W, referring to a session that covered self-management). Short-term and longer-term changes in risk score measures could support clinical decision-making between appointments and inform self-care in daily life:
“It should be something that can be looked at, uhm, can be looked at by the patient whenever they want, and the clinical teams when they want, but also proactively, you know, flagging changes in the risk or to just highlight people who maybe on the move to somewhere that's not as safe or as pleasant as where they are at the moment.” (C3W)
Participants highlighted the importance of aligning the ML model with life transitions and evolving priorities in daily life. For example, a young adult highlighted their priorities in daily life, such as taking part in social life, and explained that an ML model that predicts health risk could help them cope with the emotional challenges of managing long-term effects of T1D, if the AI system were personalised and provided customisable health risk alerts:
“Most people who are young diabetics they're trying to juggle their social life, their work, life, education. They do think well, one hypo, they might have one three times a week and think: Yeah, it's not ideal, but it's what it is. But it's when you look at it from the long run, that's doing irreversible damage. So, it's kind of preventing these risks in the first place is important, you know. And even the idea of ketoacidosis that is scary. So, to have the peace of mind that you're not going to get to that stage with the machine doing it [predicting health risk], it just takes pressure away off.” (Y4W)
In contrast, participants also anticipated potential harms of the example ML model. Taking a critical stance, participants addressed the limitations of the dataset used to train the example ML model, such as a lack of transferability due to structural and cultural differences between the geographical origin of the dataset and participants’ local context. Clinicians highlighted that ML model features derived from the dataset, such as HbA1c, were already an integral part of existing clinical practices and that ML model features, such as BUN (a kidney function indicator), were not tailored to certain patient groups, such as teenagers and young adults.
All participants felt that the emotive nature of the term ‘risk’ could cause emotional distress when presented to young adults and family carers outside clinical settings. People with T1D and their families may feel uncomfortable sharing their personal health records to inform the ML model, in particular when personal health predictions and clinical decisions are being made without their ongoing consent and input. The risk score could be “another number to be judged by” (C1W) and “another metric to live up to” (C3W). People's perception of the risk score might change over time and depend on the accuracy of predictions according to changes in daily life.
Additionally, participants focused attention on mismatches between binary ML model features and their lived experiences. Both clinicians and young adults made clear that ML model features, such as whether a person has a chronic health condition or whether they use a CGM device, cannot (and should not) define people's health risk and quality of care. Participants highlighted the idiosyncratic nature of T1D to express scepticism about whether ML models could reflect the realities of living with T1D (e.g., “Diabetes is so unpredictable - a ML model most likely won't be able to account for everything!” (Y5W)). They anticipated that an aggregated health risk score could lead to unintended health behaviour, including adverse data dependencies, over-obsession, and feelings of guilt, shame, and blame. A young adult exemplified this theme by explaining that the “management of diabetes varies between every person, a model that classifies people as high risk may stress newly diagnosed people into overcompensating with insulin doses” (Y3W).
4.1.4 Scaffolding Step-by-Step Learning and Multiple Learning Preferences
Due to the structure of the notebook, an environment was created that encouraged a step-by-step approach to learning both ML concepts and how to use the notebook itself. While these were separate learning outcomes, they appear to be intertwined, as a positive user experience with the notebook seemed to support a solid understanding of the ML concepts being taught. In terms of developing understanding of ML concepts, multiple participants identified that the notebook “built up into thinking what would your ideal model be” (F2I), and that this learning process meant that “people really were engaged in thinking of everything they could that would possibly have an influence” on their design for the final activity: creating a fictional feature importance plot. Since the final task produced important empirical data, it was vital that participants had a thorough understanding of ML before attempting the activity. Y4 described how the explanations provided by the notebook elements throughout the sessions had supported their understanding of ML applications to diabetes care. They reported that completing the previous activities was important in order to consolidate their understanding before moving on to the final activity:
“I think it built up quite nicely. [The explanations] helped me to think about what sort of inputs would go into [the model], and how it was important to ensure that that was right before starting out with a big model that could be clinically used. I would have got used to being thrown straight into machine learning, but I guess it was useful to start more basic.” (Y4I)
The other aspect of step-by-step learning was building confidence in the use of Jupyter Notebook itself. For those who had no prior knowledge of using a computational notebook, it was important to teach the participants how to interact with the notebook and internalise the process of running code cells first, so as not to overwhelm them when introducing the more demanding coding task. F4 explained that their first impression of the notebook was total unfamiliarity, and that the notebook appeared “alien” to them. They reinforced the finding that introducing the coding task too early would have overwhelmed participants, as it would have been too many new things to do in one session, and they appreciated the “gradual” approach to using the notebook.
Participants reported that they enjoyed the fact that the notebook catered to multiple ‘learning styles’. Despite this phrase not being part of the interview guide, multiple participants used the term “visual learner” (Y3, F3) to describe themselves or others, while other participants suggested ideas of “implicit” (Y5, Y6) and “explicit” (Y5, Y6) learning. Y5 identified that the feature importance plot “wasn't interactive and it was a bit more explicit rather than implied”, unlike the risk score slider which facilitated implicit learning due to its interactive nature. Y5, who had prior knowledge of ML as a computer science student, expressed that they preferred the explicit feature importance plot because “it was a bit more in depth at showing that these models can be made of plenty of different features with various weightings” (Y5I).
It became clear that it was important to participants to have a blend of both implicit and explicit coaching, because the learning styles complement each other and allow for a broad, but also deep understanding. It was important for them to first establish a baseline understanding through implicit learning, before explicitly “digging into it” (Y6I) by looking at more detailed examples that pushed the boundary of the participant's understanding. As someone without prior knowledge of ML, Y6 explained how the interactive, implicit activities provided necessary foundational knowledge, before that knowledge could be expanded on by studying more explicit information: “Having the implicit first to understand how it works and then digging into it was the right approach for me.” (Y6I).
C3 praised the way that the notebook incorporated and catered towards these different ways of digesting information, “I think different ways of putting information together suits different people differently” (C3I). Y3 identified themselves as a visual learner and described how the visual information combined with the aural aspect of discussion, supported them in learning about how ML models learn how to make predictions:
“[The notebook elements] obviously gave me visuals, which I said I learned by [...]. I found the combination of using the notebook, and then speaking with the team and obviously the [researchers] who were leading the sessions, the combination of the two helped me fully understand it and then I was able to give the best answers I could.” (Y3I)
Even though F3 identified themselves as a “reading person”, they still stressed the importance of including visual representations to support understanding:
“Without anything visual - and I'm not a visual learner, in all honesty, I'm a reading person - if you'd [explained] to me a machine learning model without those things, I would have been [sic] “What on earth are you talking about?” It would have made literally no sense.” (F3I)
Interactive visualisations seemed to fulfil the criteria for hands-on, experiential learning, and participants were vocal about how the interactivity of the notebook was the standout feature; the aspect that elevated it when comparing computational notebooks to other presentation tools, such as a PowerPoint slideshow (“The interactive models where we could enter parameters and see the changes ourselves – I thought that was great.” (F2I)). When describing the interactive risk score slider, F2 enthused, “I'm a sucker for things you can slide around and see the impact of, it's almost like gaming, isn't it, which I think is one of the engaging elements of using the notebook approach” (F2I).
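As a reading aid, the behaviour of the risk score slider can be sketched in a few lines of Python. The three inputs, their assumed ranges, and the weights below are hypothetical stand-ins (the example model used in the workshops is not reproduced in this paper); the sketch only shows how slider values could map onto a single risk score between 0 and 1.

```python
# Hypothetical sketch of the calculation behind a risk score slider.
# The three inputs, their normalisation ranges, and the weights are
# illustrative assumptions, not the workshops' example model.

def risk_score(hba1c: float, tests_per_day: int, years_since_diagnosis: float) -> float:
    """Combine three illustrative parameters into a risk score in [0, 1]."""
    # Normalise each input to a 0-1 scale (ranges are assumed).
    hba1c_norm = min(max((hba1c - 5.0) / 7.0, 0.0), 1.0)    # 5% .. 12%
    testing_norm = 1.0 - min(tests_per_day / 8.0, 1.0)      # more tests -> lower risk
    duration_norm = min(years_since_diagnosis / 20.0, 1.0)  # 0 .. 20 years

    # Weighted sum; the weights are placeholders that sum to 1.
    score = 0.5 * hba1c_norm + 0.3 * testing_norm + 0.2 * duration_norm
    return round(score, 2)

# In the workshops, the equivalent computation was bound to interactive
# controls; here we simply call the function with example values.
print(risk_score(hba1c=7.5, tests_per_day=6, years_since_diagnosis=4))  # prints 0.29
```

Binding such a function to slider widgets is what lets participants see the score change immediately as they move each control, which is the "see the impact" quality F2 describes.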
4.1.5 Harnessing of Coding by Non-Data Science Experts
Presenting the computational notebook, a technical tool, to non-data science experts was understandably difficult and met with apprehension. Participants, such as F4, who had no prior experience with coding, described their first impressions of the notebook as “daunting” (F4I), and F1 expressed, “when I opened it for the first time, I was maybe worried a little bit and not sure how I was going to cope with it” (F1I). However, with guidance from the workshop facilitators, and by approaching the notebook slowly and with step-by-step principles, participants who were initially put off still felt confident enough to give the notebook their best attempt. A young adult described the importance of having a coach to help when coding in the following way: “it was really good that we had a supervisor in to nudge us in the right direction if we needed help with it” (Y6I).
F3 described how they struggled, at first, to assign numerical weightings to features because of their fictional nature. Upon initially coding values and generating a graph, F3’s understanding was rapidly developed by the visual representation of their figures; they could now visualise how the model's features were weighted relative to each other, and iteratively adjust values until it looked right to them: “When I initially did the features of importance, I remember thinking all these are just pie in the sky figures, not really sure, and I went back and found I could change them quite easily and quite quickly as well” (F3I). While noting that coding was not necessarily the most intuitive way of creating a graph, F4 also described the benefit of the instant feedback from changing the code:
“It's the visualization of it that when you type it in it, it shows the impact of changing the value immediately. Whereas I think other ways are more intuitive in terms of, yeah, we all know how to fill in a form or a Google doc or whatever, but then you wouldn't necessarily see what the graph then looks like until you have done it all, and then you'd have to go back and start all over again.” (F4I)
F4 reiterated that by the time the coding task came around, they found the task “relatively straightforward” and although they did make mistakes while creating code, they persevered and managed to succeed due to the gradual approach to the task:
“I think if this had come further up in the workshops and it was in the first or second [session], I think I'd have really struggled. It was better being in the last one and it was a little daunting, but relatively straightforward to do. I think I got tripped up and I don't know if I was the only one tripped up by adding in features and getting caught out by not putting commas or quotation marks in the right places.” (F4I)
A positive finding was the reactions of participants who had never coded before, after succeeding in the task. Upon their first impression of the notebook, F3 noted frustration and despair at the perceived technicality of the environment, “when I first saw it, I thought ‘Oh hell, I'm never going to get to grips with this,’ because it looks very scientific to me and a bit geeky, computer science-y, and nerdy” (F3I). However, when eventually tasked with writing code, F3 discovered that the task was not as difficult as they first thought, “I was just following the steps and when I tried it the first time I thought, ‘oh I can do it, that's good!’” F3 went on to report that the coding task ended up being their favourite activity of all the workshops, and that they gained satisfaction by being able to code successfully.
“I actually quite liked doing that coding thing when it came down to it. I thought I was going to hate it, but I quite liked the interactivity of it [...] I had quite a little bit of a sense of satisfaction about being able to do it. I know it's ridiculous, but I was like, ‘Oh yeah, this is easy actually.’” (F3I)
Y6, who had no prior coding experience, also reported similar feelings of empowerment and jubilation when successfully coding. Their success with their coding enabled them to produce the most comprehensive feature importance graph out of all participants, with fourteen features in total. Because the example code only had placeholders for five features, this demonstrates that they had an adept understanding of altering code and adding elements to an array:
“For me I had a couple of issues at the start, because I wanted to do more factors compared to what was initially put into the code, but I felt like I did alright with it, and as I got into it, I picked it up pretty quickly with how to stick the right things in. [...] [I felt] really, really proud of myself because I'm not techy. I can work a computer - IT is absolutely fine - but coding is something that I never thought I'd be able to do. Because I don't understand it, [...] it is quite intimidating. Especially with a big lack of women in STEM, sometimes it's hard to pique that interest of, ‘actually, it's not as intimidating as you think it is, you can do it.’” (Y6I)
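The coding task the participants describe, extending a list of fictional features and weights inside a code cell, can be sketched as follows. This is a hypothetical reconstruction, not the notebook's actual cell: the feature names and weights are illustrative, and a plain-text bar rendering stands in for whatever plotting code the workshop used. It shows why entries added without the right commas or quotation marks would stop the cell from running, and why adding many more features than the placeholders, as Y6 did, is straightforward once the list syntax is clear.

```python
# Hypothetical reconstruction of the fictional feature-importance cell.
# Participants edited the list below, adding (feature, weight) pairs;
# the names and weights here are illustrative examples only.

features = [
    ("HbA1c", 0.30),
    ("Hypos during day", 0.25),
    ("Sleep quality", 0.20),
    ("Stress level", 0.15),
    ("Physical activity", 0.10),
]

def plot_feature_importance(pairs):
    """Render a simple text bar chart, widest bar for the largest weight."""
    max_weight = max(weight for _, weight in pairs)
    name_width = max(len(name) for name, _ in pairs)
    lines = []
    for name, weight in sorted(pairs, key=lambda p: p[1], reverse=True):
        bar = "#" * round(40 * weight / max_weight)
        lines.append(f"{name:<{name_width}} | {bar} {weight:.2f}")
    return "\n".join(lines)

print(plot_feature_importance(features))
```

Extending the list, for example with `("School stress", 0.05)`, and re-running the cell immediately redraws the chart, which matches the instant-feedback loop F3 and F4 describe above.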
4.1.6 Collaboratively Informing the Design of Alternative ML Models
Creating a fictional feature importance plot was a participatory and collaborative learning experience (see Figure 7). Participants were positive in describing how the group discussion about the notebook content furthered their own understanding of the task, and how sharing ideas helped to inform their own activities. For example, Y3 described how they were inspired by other participants’ feature importance plots, and how they used other people's ideas to improve their own fictional ML model features:
“I thought it was really insightful to see other people's priorities, because sometimes people have very different priorities in terms of their condition [...]. It helped me fill out my graph a bit more - there was things I didn't even think I wanted, and I was like, ‘oh, that's interesting, I'll put that in mine.’” (Y3I)
Participants were reassured that they were on the right track with their activity when sharing their ideas and outcomes within their group. Since their child had been diagnosed with diabetes only a year before the workshops, F3 reported that they initially felt they didn't have the knowledge to complete a feature importance plot. They felt that they were missing something that other family carers, “whose kids had been diagnosed like five or seven years,” would have, and that the other participants were “in a different place and they have a better understanding of [diabetes]”. Thus, when the participants were sharing their feature importance plots, and F3 found that “despite so many different levels of experience, people were pulling out the same things,” they were reassured that their opinion, experience, and position as a participant in the workshops was justified.
Receiving inspiration from workshop peers, participants used their fictional ML model plots to (1) articulate and share their personal health and wellbeing needs; (2) highlight the limitations of current medical devices and consumer health technologies; and (3) describe desirable functionalities of future AI-driven systems for T1D management. Most participants did not carry over the features of the example ML model: one young adult, one family carer, and two clinicians reused two of the seven features of the example ML model, namely HbA1c and gender.
Rather than adopting the example ML model, participants defined a wide range of holistic ML model features (see Table 2), from blood glucose levels, insulin measures, and adverse health events (i.e., hypos) to diet, physical activity, sleep, stress, and hormonal changes. Participants’ fictional ML model features showcase different writing styles, including conversational (e.g., Y6W: “any hypos during day?”), technology savvy (e.g., F4W: “External temp: auto-calc”), and clinical (e.g., C1W: “presence of complications”). Importantly, these ML model features address potential tensions between automation and people's agency, as shown by C1, who highlighted clinicians’ roles in making decisions (e.g., C1W: “clinical concerns”).
The fictional ML model feature importance plots display participants’ creativity in appropriating the activity in personally meaningful ways. They not only defined feasible ML model features but also articulated the need for specific functionalities of T1D technology, such as a reminder to inject (Y2), highlighted important considerations for conducting digital health interventions in real-world settings, such as addressing stigma in school settings (F1), and envisioned futuristic data collection technology, such as under-skin monitors to track blood glucose direction and speed (F4).
Moreover, participants suggested alternative directions for implementing agency and wellbeing supportive AI systems to “get on with life” (F3W), including predicting time in range (C3), recognising patterns in long-term CGM data (C1), predicting hypoglycaemia and hyperglycaemia (Y4), recommending suitable injection sites (Y1), recommending intensity of physical exercises (Y4), assessing foot health (Y1), and predicting adverse mental health states in family carers (F2).
4.2 Perceived Challenges of Using a Computational Notebook as a Co-Design Tool
4.2.1 Technical User Experience Can Contribute to Power Imbalances
The user interface of the computational notebook was unfamiliar to many participants, and it did not meet their expectations for a user-friendly, aesthetically pleasing tool. F3 criticised the visual design of the notebook, explaining that the layout generated by the markdown file was not “very visually appealing” and that “it kind of looked like a textbook” (F3I). They explained how this ‘textbook’ was off-putting to non-technical users: “Textbooks scare me a bit because I'm not a maths or science person!” (F3I) Although all participants proactively shared their experiences and created their own fictional feature importance plot, it was particularly difficult for two family carers to interact with the computational notebook because of its technical user experience. For example, F2 explained that the notebook could be perceived as exclusionary:
“If you want to be inclusive, make it easy to consume. The best phone apps that we have are the simple ones, all the hard work is done behind the scenes, and the participants just have to focus on the question that's been asked, rather than how they interact with the notebook itself.” (F2I)
Difficulty with aspects of the software, such as navigating the document and executing code cells, made participants initially feel like they lacked the expertise and agency required to engage with the computational notebook. F3 explained how the presentation of the environment, such as the code in cells, made them “feel like a complete fish out of water with this notebook because of this coding that I can see that I don't understand” (F3I). Although the code in the cells had been abstracted away as much as possible, it was still unfamiliar to the participant to the extent that it had a negative effect on their experience and, consequently, their understanding. F3 continued, “If [the code] was taken out I'd have probably quite a different feeling about it,” and went on to discuss how the design of the notebook didn't seem to have much consideration for non-data science experts:
“It looked very scientific and that did put me off, if I'm honest. I just thought some of the things that were on there didn't really make sense to me, and it felt like there hadn't been any consideration for the way that some of the terminology might make the user feel.” (F3I)
Furthermore, participants struggled with running code cells in the notebook. The notebook had an issue where code cells would not execute on first load, and the notebook kernel required restarting. Young adults who had no prior experience with computational notebooks demonstrated fluency with the notebook by restarting the stalled kernel without problems. However, as reported in the post-workshop interviews, family carers (F1, F3, F4) lacked the confidence to fix the issue on their own and required guidance to restart the kernel.
By not offering a user experience similar to that of typical digital tools, such as a shared document or whiteboard, the notebook environment strengthened data explanation, exploration, and visualisation affordances. However, this came at the cost of participants’ ability to digest and understand the information. F4 discussed how the computational notebook was harder to use than other collaborative tools that they had used before because of its technical nature:
“It was very different to anything I'd used previously. For someone who doesn't have any kind of coding or particular IT expertise, it did feel more technical than a lot of the other ways in which you can get people to share comments. I've been to other online workshops that are more sort of focus group-like where there are virtual post it boards and that kind of thing, which is easier to get to grips with - the computational workbook felt far more technical.” (F4I)
4.2.2 Ambiguity Can Lead to Information Asymmetries
The notebook sometimes failed to communicate the intentions of activities. This miscommunication resulted in breakdowns in different ways, such as participants not taking away potential learning outcomes from activities, not understanding the information being presented in an activity, and not performing an activity as intended. Whilst most of the notebook was successful at communicating its intentions, these occasional breakdowns are worth reporting.
The notebook failed to communicate the context of the risk score slider activity and explain that it was an example. During a conversation with Y6 about the risk score slider, they reported:
“I was so frustrated with [the risk score slider] because it's just not representative of diabetes management. You could be testing six times a day, but if you don't act upon it and don't inject, you're still at a high risk of going into hospital. There needed to be three different outcomes: hospitalisation for hypoglycaemia, hyperglycaemia, and DKA” (Y6I).
Y6 had missed, understandably so, a bullet point in a list of text above the risk score slider that said, ‘The model shown is an example of what can be done and does not represent diabetes education or advice.’ By missing this one line in the notebook, Y6 became disillusioned with the workshop activity. For Y6, the context of this activity completely changed from foundational learning about the concept of ML, to frustration at the perceived simplicity of the model and lack of awareness from the designers (“I might have not picked up that this wasn't a fully-fledged thing at first, so I think that's on me.” (Y6I)). Y5 also expressed how they were unsure if the risk score slider was an example: “I wasn't really sure if this was a genuine model [...], so I was slightly concerned because I'm not sure how good this model really is. I feel like just three features and a value between 0 and 1 wouldn't actually produce any meaningful results.” (Y5I).
The explorative and open-ended nature of design probes presented challenges for participants in sharing their experiences and taking part in design activities. For example, Y1 needed the notebook to go further in explaining what contribution was expected of them before they felt encouraged to participate fully:
“I would have been able to contribute more with a greater understanding of the project and where it was trying to go. I wouldn't say that I held loads back, but sometimes I felt like what we were discussing was a bit vague. I didn't really know what to say and what not to say.” (Y1I)
4.2.3 ‘One Size Fits All’ Approach to Explaining ML Does Not Meet Individual Information Needs
The notebook provided different forms of ML explanations, and while some participants were satisfied with the amount of detail the notebook went into, a handful expressed interest in more complex explanations. C3, an advocate for the given level of detail, reported: “I think anything more perhaps would have been a little bit too much, I think it might have lost my focus from the information that I was there to provide” (C3I). Y1 recognised that there was more to be said on the topic of ML, but deemed that the notebook had struck a healthy balance of detail and simplification:
“There's obviously a lot more that goes on, but I don't know what that is and how you'd visualise that, so for me personally, that's a decent level of detail and explanation, but also a good level of simplification because I feel like if you put too much maths in it, I wouldn't understand.” (Y1I)
While our intention was to keep the notebook elements simple to facilitate understanding in all users, F2 thought that making the content of the risk score slider more complex, by adding more variables, would be insightful: “With just the three dials, it is pretty easy to get to the bottom of what you're trying to say. Would I have got more or less insight out of having more or less of these [variables] to choose from? I'd have probably gone for more just to see.” (F2I)
Y5 observed that the simplicity of the examples could be counter-productive, creating a narrative that ML was not capable of handling more than a few parameters, because none of the early examples incorporated more than three input variables. This could lead participants without prior understanding of ML to underestimate its power. Y5 suggested that some examples could afford to be more complex:
“In the explanation of a model, this training and testing model is very simple as well. Because this isn't very interactive, [...] it can be a bit more of a complicated image of a model to show the extent of them; rather than ‘OK, these models are always just taking in one, two or three features, and any more than that, it might not be able to handle.’ I feel like people who wouldn't be comfortable with machine learning wouldn't know [otherwise] if they haven't touched it at all before. I would assume it's quite obvious, but it was only because all the examples in the notebook are very simple, that I wasn't really sure how simple the actual models potentially were.” (Y5I)
While most of the participants who were eager for more detailed content had prior experience with ML, some participants without prior experience were still open to deeper ML explanations. When asked if they thought the explanations given in the workshop were sufficient, Y4 recognised the benefit of more information:
“I think in the workshop, perhaps because it's not like a fully-fledged finalized model that's going to be harder to do and I understand that, but I think maybe a little bit more information is a starting point. Just the basics of how these things are integrated would be useful, but I guess without a complete understanding of how a model is going to work, you can't really give a lot of information.” (Y4I)