title: Balancing privacy and open science in the context of COVID-19: a response to Ifenthaler and Schumacher (2016)
authors: Rosenberg, Joshua M.; Staudt Willet, K. Bret
date: 2020-11-17
journal: Educational Technology Research and Development
DOI: 10.1007/s11423-020-09860-8

Abstract: Privacy and confidentiality are core considerations in education, while at the same time, using and sharing data, and, more broadly, open science, are increasingly valued by editors, funding agencies, and the public. This manuscript responds to an empirical investigation of students' perceptions of the use of their data in learning analytics systems by Ifenthaler and Schumacher (Educational Technology Research and Development, 64: 923-938, 2016). We summarize their work in the context of the COVID-19 pandemic and the resulting shift to digital modes of teaching and learning by many teachers, using the tension between privacy and open science to frame our response. We offer informed recommendations for educational technology researchers in light of Ifenthaler and Schumacher's findings as well as strategies for navigating the tension between these important values. We conclude with a call for educational technology scholars to meet the challenge of studying learning (and disruptions to learning) in light of COVID-19 while protecting the privacy of students in ways that go beyond what Institutional Review Boards consider to be within their purview.

Privacy and confidentiality are core considerations in education and should be protected. At the same time, sharing data, and, more broadly, open science, is increasingly valued by editors, funding agencies, and the public (van der Zee and Reich 2018). For instance, during the COVID-19 pandemic, sharing data may have enabled researchers to address questions not directly related to the original purposes for which the data were collected (Doughton 2020).
While a core consideration, privacy often exists in tension with the utility of learning analytics systems (Chen and Zhu 2019). For example, during COVID-19, schools have rapidly shifted to digital modes of teaching and learning, prompting researchers to evaluate the effectiveness of this response. At such a time of crisis, data about student learning, including data collected through learning analytics systems, are especially important for understanding disparities caused or amplified by the pandemic (Kuhfeld and Tarasawa 2020). Although emergency modes of learning may invite more educational technologies into the classroom or altogether change our understanding of what constitutes "a classroom," the increased presence of these technologies also renews questions about how educational data are collected and used (Watters 2020). Data, even data collected for valorous purposes in learning analytics or open science, can be misused by those with power toward destructive ends (D'Ignazio and Klein 2020).

In an empirical study, Ifenthaler and Schumacher (2016) asked students about their perceptions of privacy in learning analytics. A key insight from this research, similar to findings from Fiesler and Proferes' (2018) study of participants' privacy in empirical research, is that many of the practices in which researchers engage are not desired by student participants. Additionally, students did not perceive all collected data equally: there was variability in which data sources students were comfortable sharing. For instance, although more than 80% of respondents said they would not mind their course enrollment data being used in learning analytics systems, fewer than 25% wanted their parents' educational level to be used, and fewer than 10% were comfortable with their medical data being used. Lundberg et al.
(2019) highlighted a broader tension between privacy and open science that further illustrates the importance of Ifenthaler and Schumacher's (2016) learning analytics study. Due to calls for scientists to share their work in an open way to build trust in findings, Lundberg et al. (2019) wrestled with how much data to share and with whom. They did not prescribe a recipe to follow; instead, their solution was deeply contextualized in their project, discipline, and understanding of the risks to participants. Lundberg et al. developed a decision-making model for balancing privacy and open science (Fig. 1), creating a tiered system of sharing some data publicly and other data through a gated application system.

In summary, Ifenthaler and Schumacher's (2016) study highlighted the importance of privacy considerations in learning analytics, and Lundberg et al. (2019) presented a way to move forward with open science values while also respecting the privacy and rights of participants. Our key implication is that educational technology designers and researchers should consider open science and privacy as values that need to be balanced in their work. The tension between open science and privacy parallels the tension between the utility of learning analytics and the privacy of students (Chen and Zhu 2019). Whether one is carrying out empirical research or deploying a learning analytics system, balancing begins with a deep understanding of the specifics of the context (Greenhalgh et al. 2021; Kimmons and Veletsianos 2018), including, as Ifenthaler and Schumacher (2016) point out, what students think of the specific data being collected. We offer several thoughts on how researchers can achieve this balance in their work.

First, researchers should choose to listen to students, especially when collecting sensitive data for learning analytics or empirical research.
Instructional designers and researchers should also keep in mind that some data are more private than others (Ifenthaler and Schumacher 2016) and assume that participants are not inclined to share health-related data (e.g., daily wellness checks, COVID-19 diagnostic tests). Both instructional designers and researchers should be clear in communicating to teachers and students what data they are collecting, as well as offer opportunities for participants to learn more and give feedback regarding their wishes for how such data are used. These recommendations are likely more stringent than what the Family Educational Rights and Privacy Act (FERPA) or an Institutional Review Board (IRB) may require. Therefore, instructional designers and researchers should consider adherence to these guidelines as a necessary but insufficient step in protecting participants' privacy.

Second, to date, open science is not widespread in education (van der Zee and Reich 2018). However, there are important reasons for educational technology researchers and designers to share data and related materials more openly than in the past. At the same time, participants' privacy is paramount and must not be compromised. In our recent work (Greenhalgh et al. 2020; Staudt Willet and Carpenter 2020), we have publicly shared code for analysis on GitHub and data on the Open Science Framework (osf.io), but did so in different ways, depending on the nature of the data we used. We deemed Twitter data to be more sensitive and thus created a carefully anonymized version of the dataset (Greenhalgh et al. 2020). We also created an application process for other researchers to request access to the original data. For access, we required a project description and strategies for protecting the privacy of participants in the data.
In contrast, in a different study, we shared Reddit data in their original form, because the platform norm is that Reddit users' profiles are not typically identifiable with their offline identities (Staudt Willet and Carpenter 2020).

[Fig. 1. Lundberg et al.'s (2019) model for balancing risk to respondents with societal (science) benefits. Figure reproduced from Lundberg et al. (2019).]

Minimizing participants' risks may require sharing no data; maximizing open science requires sharing all data. However, in many cases, neither extreme is tenable. A way forward is to balance privacy and open science, holding risks to participants as one consideration and the benefits of sharing data as another.

Finally, researchers should seek out guidelines, such as Prinsloo and Slade's (2018), for the informed consent process. They argued that researchers should consider how certain the outcome of a learning analytics application is, and how much risk is posed to participants, to determine how the consent process should take place.

Future work may extend Lundberg et al.'s (2019) model for balancing privacy and open science. Future work should also build upon Chen and Zhu's (2019) findings and call to balance the practical utility of learning analytics with students' rights by articulating strategies and guidelines for how learning analytics systems can do this. The targeted guidance available concerning the consent process is a start. Furthermore, a rubric that addresses multiple stages of the learning analytics process may be especially helpful for scholars, including ourselves, seeking to balance the benefits of openness with the importance of protecting students' privacy in their work.

Data on student learning are especially important during educational disruptions related to COVID-19. However, schools' collection of health-related data from teachers and students poses new risks. Even before the current pandemic, scholars from learning analytics (Arnold et al.
2020) and educational technology (Krutka et al. 2019; Greenhalgh et al. 2021; Kimmons and Veletsianos 2018) had called for researchers to consider privacy issues beyond what is in the purview of FERPA or the IRB. The contribution of Ifenthaler and Schumacher's (2016) work to these renewed ethical conversations in learning analytics is to ensure students' preferences are taken into consideration, especially when applying Lundberg et al.'s (2019) model for balancing privacy and open science.

References

Arnold et al. (2020). Learning analytics principles of use: Making ethics actionable.
Chen and Zhu (2019). Towards value-sensitive learning analytics design.
D'Ignazio and Klein (2020). Data feminism.
Doughton (2020). Drugs touted by Trump, blood from recovered patients: Seattle scientists seek coronavirus cures. The Seattle Times.
Fiesler and Proferes (2018). "Participant" perceptions of Twitter research ethics.
Greenhalgh et al. (2021). Considerations for using social media data in learning design and technology research.
Greenhalgh et al. (2020). Identifying multiple learning spaces within a single teacher-focused Twitter hashtag.
Ifenthaler and Schumacher (2016). Student perceptions of privacy principles for learning analytics.
Kimmons and Veletsianos (2018). Public internet data mining methods in instructional design, educational technology, and online learning research.
Krutka et al. (2019). Foregrounding technoethics: Toward critical perspectives in technology and teacher education.
Kuhfeld and Tarasawa (2020). The COVID-19 slide: What summer learning loss can tell us about the potential impact of school closures on student academic achievement. Northwest Evaluation Association.
Lundberg et al. (2019). Privacy, ethics, and data access: A case study of the Fragile Families Challenge.
Prinsloo and Slade (2018). Student consent in learning analytics: The devil in the details?
Staudt Willet and Carpenter (2020). Teachers on Reddit? Exploring contributions and interactions in four teaching-related subreddits.
Watters (2020). The ed-tech imaginary.

Conflict of interest: The authors have no potential conflicts of interest to disclose.

Ethical approval: This research did not involve human participants; informed consent was not applicable.

Publisher's Note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Joshua M. Rosenberg (PhD, Michigan State University) is an assistant professor of STEM education and faculty fellow at the Center for Enhancing Education in Mathematics and Sciences at the University of Tennessee, Knoxville. His research focuses on how learners think of and with data, particularly in science education settings. Visit Joshua's website, https://joshuamrosenberg.com, to learn more, and connect with him on Twitter: @jrosenberg6432.

K. Bret Staudt Willet researches networked learning at the intersection of information science and teacher education. Specifically, he has been exploring teacher networks and how social media platforms support induction and self-directed professional learning for K-20 educators. He is a Ph.D. candidate in Educational