FUTURE PROSPECTS OF ARTIFICIAL INTELLIGENCE IN EDUCATION: A PRELIMINARY ANALYSIS OF EDUCATOR PERSPECTIVES FROM FOCUS GROUPS IN JAPAN, SPAIN, AND GERMANY

Artificial intelligence in education (AIEd) is a growing field of research that has the potential to transform learning and teaching practices. The process of developing ethical AIEd applications that are fit-for-purpose requires ongoing multi-stakeholder dialogues that include students and teachers. The present study sought to explore the perspectives of teachers in higher education on the possible futures of AIEd. Using qualitative methods, we conducted focus-group discussions in three countries (Japan, Spain, and Germany) with educators working in the social sciences. We presented participants with four scenarios that each described the implementation of a hypothetical AIEd application. A descriptive thematic analysis of the focus-group transcripts identified two key themes regarding possible future AIEd applications: 1) concerns about the validity of input variables and the accuracy of output variables, and 2) the importance of students having an active role in the integration of AI in education. These contributions highlight the importance of including teachers and students in multi-stakeholder discussions to shape the future of education.


Introduction
The recent discourse on ChatGPT (e.g., Bai et al., in press; Kasneci et al., 2023; Rudolph et al., 2023) and an increasing rate of research on artificial intelligence in education (AIEd; see Zawacki-Richter et al., 2019, for review) demonstrate a growing interest in the potential for AI to transform teaching and learning (Tuomi, 2018). However, the prohibitive costs of developing cutting-edge AI tools, combined with resource disparities between stakeholders, can result in AI applications that further entrench existing inequalities (see Bender et al., 2021; Facer & Selwyn, 2021; Hu et al., 2019, for critical discussions). In response, efforts to develop policy for ethical AI have identified multi-stakeholder dialogue as one method to mitigate the risk of exacerbating societal inequities (e.g., IEEE, 2018; OECD, 2022; UNESCO, 2021).
The present project aims to contribute to this multi-stakeholder dialogue by responding to Zawacki-Richter et al.'s (2019) call "for educational perspectives on these technological developments" (p. 22). We invited educators from multiple countries to a series of online focus groups to share their views on four scenarios that each describe the implementation of a hypothetical AIEd application. These scenarios correspond to four areas of AIEd research identified in Zawacki-Richter et al.'s review and were designed to capture possible outcomes of some key macro and meso variables (see Bai et al., 2022, for discussion of how scenarios were developed). While data collection for the overall project is still ongoing, the present paper takes the opportunity to explore educator perspectives from a subset of the data from educators working in the social sciences in Japan, Spain, and Germany.

Methods
We conducted a qualitative, focus-group study using scenarios or "vignettes" as the basis for discussions. As noted by Kitzinger (1995, p. 300), the focus group is "a method that facilitates the expression of criticism and the exploration of different types of solutions"; this makes focus groups well-suited for gathering perspectives on the potential benefits and challenges of AIEd applications. Using the scenarios enabled us to present potential future outcomes of AIEd in concrete terms and allowed participants to highlight the variables that they found most important. One potential disadvantage of our design is that particular details in the scenario descriptions may unduly influence participant responses and constrain the scope of the discussions. However, the use of focus groups generally allows participants more control over the direction of the discussions compared with individual interviews (Morgan, 1997).
Ethics approval for the study was granted by the University of Oldenburg and University of Lleida Ethics Committees. The development of Participant Information Sheets and Consent Forms was guided by Tolich's (2009) discussion on focus-group ethics.

Participants
The present paper analyses data from 15 educators working in social-science subjects in Japan (five participants), Spain (seven participants), and Germany (three participants). Fourteen participants completed an optional post-discussion questionnaire. The participants were diverse in terms of gender (nine identified as female and five as male), age (three between 30-39, five between 40-49, three between 50-59, and three between 60-69), and experience, with the number of years working in higher education ranging from 3 to 30 years (M = 16.71, SD = 9.30) and roles ranging from Research Assistant to Professor. Two participants indicated direct experience working with AIEd applications in the questionnaire, and others indicated familiarity with various AI applications and techniques in the discussions.

Materials
During each focus group, we presented participants with four scenarios that each describe the implementation of a hypothetical AIEd application. These scenarios were developed to capture some possible outcomes of key variables and correspond to four areas of AIEd applications identified in Zawacki-Richter et al.'s (2019) review. The full texts of the scenarios are presented in Table 1.

S1. Profiling and prediction
A prediction system is implemented in your institution to predict students' performance and their risk of dropout. The system collects a range of data from each student (for example, assignment grades, attendance, and interactions with the institution's online systems) to calculate the probability of the student achieving a particular grade in each of their courses. Teachers can view the tracked data and the system's predictions via a dashboard on the institution's virtual platforms. In addition, when a student is classified as "at risk", the system sends student support a notification and a personalized list of suggested interventions. The support staff make the final decision about which intervention to implement. The system was developed by a joint collaboration between academic researchers and a private company. It is provided to your institution for a reduced subscription cost in exchange for pseudonymized student data that the company uses to improve the system's performance.

S2. Assessment and evaluation
An automated essay scoring system is implemented in an introductory course in your field. The system is recommended for courses with large classes and is trained with a subset of essays that are hand-marked by teachers. These hand-marked essays can be randomly sampled from the submitted essays or reused from a previous year. The system then analyses the remaining unmarked essays and automatically assigns a grade to each essay, along with a confidence rating. Essays marked low confidence are flagged for teachers to review. The system was developed by a private company. It is provided to your institution for a reduced subscription cost in exchange for pseudonymized student essays that the company uses to improve the system's performance.

S3. Adaptive systems and personalisation
As part of a pilot study, a multi-function learning management platform is implemented in your institution. The system collects a range of data from each student (for example, assignment grades, interactions with the institution's online and learning systems, and use of campus facilities) to monitor their performance and development and make recommendations for personalized learning paths, student support services, and future courses. Teachers can monitor students' progress via a dashboard and override the system's recommendations if they disagree. At the end of each semester, teachers and developers conduct a review of the system's performance and adjust the system's settings in a quasi-experimental design. In addition, students may opt out and withdraw their data at any point. The system is being developed by a joint collaboration between academic researchers and a private company. During its development, the system is provided to your institution free of charge in exchange for pseudonymized student data that the company uses to improve the system's performance.

S4. Intelligent tutoring system
An intelligent tutoring system is implemented in an introductory (first-year) course in your field. The system includes a range of inbuilt learning tasks that cover basic and advanced concepts. The system analyses student performance on each task (for example, time taken to complete the task, types of errors made, and performance in similar tasks) to give personalized feedback and decide on the next appropriate task. The system sends teachers a notification if the student is stuck or struggling with a particular task or concept. The system was developed by a university research group, and trained with data collected from the researchers' institution. It is provided to your institution free of charge as open-source software (i.e., an Open Educational Resource) and no student data is sent outside of your institution to improve the system's performance.

Procedure
Focus groups started with a brief introduction, the collection of informed consent, and two warm-up questions. Participants were then presented with the scenarios and asked to share their perspectives on the possible benefits and challenges of each AIEd application, and what they would change if they were involved in implementing such a system. The scenarios served as the basis for discussion, and participants were free to deviate and discuss aspects of AIEd that they found most important. We then invited participants to share their general thoughts on AIEd and, after the focus group, to complete a brief questionnaire to provide basic demographic information.
All focus-group discussions were conducted online and lasted between 80 and 120 minutes. The focus group in Spain was conducted in Spanish, and the focus groups in Japan and Germany were conducted in English. All discussions were recorded and transcribed using arbitrary labels to de-identify participants. The discussion in Japan was transcribed by hand. The other discussions were first transcribed by speech-to-text models (Microsoft Teams in Spain, vosk-model-en-us-0.22 in Germany) and then checked and corrected manually; the Spanish transcript was then translated into English by the second author. As the present analysis focuses on what participants said, rather than how they said it, the extracts below were lightly cleaned to correct grammar and remove false starts and non-semantic sounds (e.g., "um", "ah").
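As a purely illustrative aside, the de-identification step described above can be sketched as follows. The function, speaker names, and label prefix here are hypothetical examples for exposition, not the project's actual tooling:

```python
import re

def deidentify(transcript: str, speakers: list[str], prefix: str) -> str:
    """Replace each speaker's name with an arbitrary label (e.g. DE1, DE2, ...).

    NOTE: `speakers` and `prefix` are invented for illustration; the study
    does not describe its de-identification procedure at this level of detail.
    """
    # Assign labels in order of first appearance in the speakers list.
    mapping = {name: f"{prefix}{i + 1}" for i, name in enumerate(speakers)}
    for name, label in mapping.items():
        # \b word boundaries ensure only whole-word name matches are replaced.
        transcript = re.sub(rf"\b{re.escape(name)}\b", label, transcript)
    return transcript

raw = "Anna: I agree with Ben. Ben: Thanks, Anna."
print(deidentify(raw, ["Anna", "Ben"], "DE"))
# → DE1: I agree with DE2. DE2: Thanks, DE1.
```

A real pipeline would also need to handle nicknames, misspellings from speech-to-text output, and incidental identifying details, which is why manual checking (as the study reports) remains essential.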

Data analysis
We used thematic analysis (Braun & Clarke, 2013) as it allowed us flexibility in identifying and interpreting patterns from the transcripts. As we wanted to focus on what the participants found most important (i.e., what elements they chose to highlight during the discussions), we chose a descriptive approach that allowed for disjuncture between the questions posed to the participants and the themes identified from the transcripts (Braun & Clarke, 2006).
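Thematic analysis is an interpretive, manual process, but the bookkeeping side of a descriptive approach, collating coded extracts under candidate themes, can be illustrated with a minimal sketch. The extract IDs follow the paper's labelling convention (country code, participant number, scenario), while the theme assignments below are invented for illustration only:

```python
from collections import defaultdict

# Hypothetical coded extracts: (participant-scenario ID, assigned theme).
# These pairings are illustrative, not the study's actual coding.
coded_extracts = [
    ("ES6-S4", "inputs and outputs"),
    ("NH1-S1", "inputs and outputs"),
    ("DE1-S3", "inputs and outputs"),
    ("ES2-S1", "student agency"),
    ("DE3-S3", "student agency"),
]

# Group extract IDs under each theme, preserving coding order.
by_theme = defaultdict(list)
for extract_id, theme in coded_extracts:
    by_theme[theme].append(extract_id)

for theme, ids in by_theme.items():
    print(f"{theme}: {len(ids)} extracts ({', '.join(ids)})")
```

Such a tally only supports the descriptive step of seeing which themes are most represented; the identification and interpretation of the themes themselves remains a qualitative judgement, as the analysis above describes.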

Results
The focus-group discussions generated rich data and covered a range of different issues. In the present paper, we limit ourselves to exploring two of the central themes identified in the data: 1) input and output variables in the hypothetical AIEd applications, and 2) student agency. Participant contributions are identified below by country (NH, ES, and DE for Japan, Spain, and Germany, respectively), participant (n), and scenario (S1-4; see Table 1).

Inputs and Outputs
Participants expressed concern about the validity of input variables used in the AI systems and the possible effects of confounding variables. The following contribution succinctly captures the question of whether the input variables can accurately measure constructs that are pedagogically meaningful:
ES6-S4: What I still find a little bit creaky, and hence my scepticism about artificial intelligence, is the variables that we use, right? The time it takes to complete a task: how can you assess whether the student has been devoting all their time to completing the task or whether they have gone for a coffee and then come back?
In addition to concerns about the variables used as inputs, participants also expressed concern that other relevant variables would be overlooked. For example, the following contribution contrasted human and AI decision making, and specifically their ability to seek and integrate contextual information:
NH1-S1: I mean the AI won't care that the person had COVID or whatever, but I would care, and I said "OK now I'll give you something else to make up that". So you can't really just leave it up to the AI because the AI is not going to find out what was wrong, why the student missed those assignments or didn't come to class and things like that.
Another participant further problematised the issue of validity by critiquing the use of historical data to predict outcomes in a dynamic environment:
DE1-S3: I would be sceptical about the output of it. Like, it is influenced from past data and maybe students reacted like to… kind of a labour market that is different than the next year and it recommends maybe things that are not as relevant anymore.
The response above also raises questions about the accuracy and value of the output of AI systems, a point developed further in the following:
DE2-S4: The question is how do you interpret it? And how do you put meaning into it? And I have huge doubts if you get something meaningful from the system, or if it's reduced to "well he didn't work today". And then how do you actually draw conclusions, and what do you do with that?
In exploring how the outputs of AI could be used, and misused, participants expressed concern about the effects on their teaching practices. For example:
ES6-S2: We have to be careful, also, with the amount of information that can come to us. We can be overloaded by this information.
NH4-S1: I also find it problematic in the sense that these data may kind of relax advisors [who] have less interaction with students and too much dependency on data and give some bias for decision making on the university side.

Student Agency
Participants also expressed concern about how the AI systems might affect students and argued that students should have a more active role in the future of AIEd. This theme encompasses the issues of informed consent and how students control whether their data are used to train and fine-tune AI systems:
ES2-S1: I am concerned that the idea of... well, is there informed consent on the part of the student body? Are they just passive actors who are subjected to this technological intervention or are they somehow counted on? Can I, as an individual agent, decide to say "no", that my data will not be collected?
Other participants echoed this sentiment and questioned whether it was possible for students to withdraw their consent:
DE3-S3: If the student… chooses to opt out at a certain point, but his data is already gone to the company in this anonymised [form], there's no way of getting it back and opting out of being a part of the product development, even though it's anonymised.
Furthermore, another participant argued that some AIEd applications may lead students to feel implicitly coerced to provide private, personal data:
DE1-S4: I thought about how… how maybe vulnerable it makes students, because when such a system is in place, they would maybe feel coerced to disclose personal information to me, about why they are not performing well. They should not be encouraged to tell me, for example, that they struggle with something, that they have kids, maybe they don't want to disclose, but struggle because of that and it's their right and their privacy not to tell me. They don't have to give up that information, and I would hate if such a system makes them think they have to explain to me why they're not performing.
In addition to concerns about data privacy and informed consent, participants also expressed concern that the use of AIEd applications could result in unintended, negative effects on student behaviour. This was especially salient in the discussions of Scenario 1 (profiling and prediction):
NH1-S1: The disadvantage of that is that it would actually make students focus on outcome rather than doing the work to reach the goal.
DE3-S1: It runs the risk of becoming a self-fulfilling prophecy because students […] maybe they didn't feel like they were at risk of failing the class because they learn in a different pace, or whatever, and because an AI tells them "You're at risk of failing" they might fail or, might see it as an actual prediction.
Part of this concern relates to a broader issue of how AI-informed decisions may influence or constrain a student's pedagogical journey. For example:
NH3-S3: It may hinder the creativity, you know, some other interests may arise or by having conversation with students, great ideas might pop up. But by having substantive information that suggests future courses and learning path that may confine the creative mind of students as well as you know us as a mentor of that process.
Similarly, other participants highlighted the inherent variation between different students and argued that AIEd applications designed to optimize student performance may only be tailored for students who "follow the model path" (NH4). This sentiment is further articulated in the following:
DE2-S4: It always boils down for me to having this kind of optimal student picture, and again there's so many premises that we put into it and how should a student of today be? […] I have seen quite a couple of students who basically failed a semester, but then woke up and realized "okay I've now done this part of my life experience and now I can really sit down […] and do my studies" and get really, really good marks, and this is all kind of equalled out in such a system because it's simply this kind of Bilderbuch, idealized student who, from the first day just learns hard.
In contrast to the concerns raised above, participants were more positive when considering possibilities for students to be active participants in the scenarios. This included the potential for students to use the AIEd applications as self-monitoring tools or aids to making their own decisions. For example:
DE3-S1: One of the options could be that students would have to opt in to use it, so that the students who feel "I need someone to track my progress, or I need that extra little bit of help, or the push in the right direction if I'm close to failing the class", they can opt into using it.
ES4-S3: In terms of scenarios, it [S3, adaptive systems and personalisation] seems to me to be the model that is the most respectful, above all with the issue of the students' agency, right? And well, I would just like to emphasise that, of course, this type of situation makes sense insofar as it can complement or help students' decision-making.

Discussion
Across focus groups conducted in three countries, educators working in the social sciences articulated rich and nuanced views on possible applications of AI in education. Two of the major themes in the discussions can be summarised as: 1) concerns about the validity of variables that are fed into AI systems and the accuracy of information produced by AI, and 2) the importance of students taking an active role in deciding how they interface with, and provide data to, the AI systems. Together, these themes highlight the importance of engaging students and teachers in participatory AI research (Prabhakaran et al., 2022) and multi-stakeholder discussions during the design, development, and implementation of AI technologies (Bai et al., 2022, in press).
While the two themes explored in the present paper highlight educator concerns about AIEd, it would be misleading to portray their perspectives as negative towards AIEd overall. Indeed, participants frequently expressed enthusiastic and nuanced views on the benefits and affordances of the AIEd applications, including the hope that AIEd applications can be used to ensure fairer assessment, encourage individualised learning pathways, and provide better access to support systems. These and other themes will be explored further once data collection for the project is completed.
While the present analysis takes a descriptive approach to foreground the perspectives of our participants, future analyses of the completed dataset can adopt a more theoretically informed approach (Braun & Clarke, 2006, 2013) to explore directly how educators perceive the variables that informed the scenarios (Bai et al., 2022, in press). Even within the limited data reported here, the discussions have made contact with issues such as AI ethics (see, e.g., du Boulay, 2022), accountability (e.g., Porayska-Pomsta & Rajendran, 2019), automation bias (e.g., Skitka et al., 2000), and construct validity (e.g., Perelman, 2020). However, more data are required to accurately represent the diversity of educator perspectives. The present project is therefore gathering more data from educators working in other subjects (e.g., Science, Technology, Engineering, and Mathematics) and in three additional countries (Turkey, Korea, and the USA).
As we argued previously, "the activities involved in shaping the future are collective and require an ongoing exchange of ideas" (Bai et al., 2022, p. 63). To further these collective efforts, we provide the full text of the scenario descriptions in Table 1 and invite interested readers to use and/or modify these scenarios for their own research. We hope that these and future discussions with students and teachers will contribute to ongoing multi-stakeholder dialogues and help to shape a more inclusive future.
Authors' note: This project is funded by the Volkswagen Foundation and the Ministry of Science and Culture in Lower Saxony (2N3743). We thank Cengiz Hakan Aydin, Aras Bozkurt, Junhong Xiao, and Patricia Slagter van Tryon for their active collaboration and contributions to the scenarios in Table 1. We also thank Andrea Broens for insightful discussions on qualitative research, and Wolfgang Müskens and Sreekari Vogeti for valuable feedback on previous versions of this paper.

Table 1: Text of four hypothetical scenarios used in the focus groups